Fully phased allele-level sequencing of highly polymorphic HLA genes is greatly facilitated by SMRT Sequencing technology. In the present work, we have evaluated multiple DNA barcoding strategies for multiplexing several loci from multiple individuals, using three different tagging methods. Specifically, MHC class I genes HLA-A, -B, and -C were indexed via DNA barcodes by either tailed primers or barcoded SMRTbell adapters. Eight different 16-bp barcode sequences were used in symmetric and asymmetric pairing. Eight DNA barcoded adapters in symmetric pairing were independently ligated to a pool of HLA-A, -B, and -C for eight different individuals, one at a time, and pooled for sequencing on a single SMRT Cell. Amplicons generated from barcoded primers were pooled upfront for library generation. Eight symmetric barcoded primers were generated for HLA class I genes. These primers facilitated multiplexing of 8 samples and also allowed generation of unique asymmetric pairings for simultaneous amplification from 28 reference genomic DNA samples. The data generated from all three methods were analyzed using the LAA protocol in SMRT Analysis v2.3. The consensus sequences generated were typed using GenDx NGS engine HLA-typing software.
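The sample counts in this abstract are consistent with simple combinatorics: eight barcodes give eight symmetric (same barcode on both ends) pairings and 28 unordered asymmetric pairings. The sketch below is only an illustration of that count, assuming that the orientation of an asymmetric pair is not distinguished; the barcode names (`BC1` to `BC8`) and the function `barcode_pairings` are hypothetical and not taken from the study.

```python
from itertools import combinations


def barcode_pairings(n_barcodes):
    """Enumerate symmetric and unordered asymmetric barcode pairings.

    Barcode names are placeholders (BC1, BC2, ...), not real sequences.
    """
    barcodes = [f"BC{i}" for i in range(1, n_barcodes + 1)]
    # Symmetric pairing: the same barcode on both ends of the amplicon.
    symmetric = [(b, b) for b in barcodes]
    # Asymmetric pairing: two different barcodes, order ignored (assumption).
    asymmetric = list(combinations(barcodes, 2))
    return symmetric, asymmetric


symmetric, asymmetric = barcode_pairings(8)
print(len(symmetric), len(asymmetric))
```

Under this assumption the eight barcodes cover 8 samples in symmetric mode and 28 samples in asymmetric mode, matching the numbers reported above. If forward and reverse orientations were treated as distinct, the asymmetric count would double.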
The development of high-throughput sequencing techniques has greatly benefited our understanding of microbial ecology; yet the methods producing short reads suffer from limited species-level resolution and uncertainty of identification. Here we optimize PacBio-based metabarcoding protocols covering the Internal Transcribed Spacer (ITS region) and partial Small Subunit (SSU) of the rRNA gene for species-level identification of all eukaryotes, with a specific focus on Fungi (including Glomeromycota) and Stramenopila (particularly Oomycota). Based on tests on composite soil samples and mock communities, we propose the most suitable degenerate primers, ITS9munngs + ITS4ngsUni, for eukaryotes and selected groups therein, and discuss the pros and cons of long read-based identification of eukaryotes.
Relative Performance of MinION (Oxford Nanopore Technologies) versus Sequel (Pacific Biosciences) Third-Generation Sequencing Instruments in Identification of Agricultural and Forest Fungal Pathogens.
Culture-based molecular identification methods have revolutionized detection of pathogens, yet these methods are slow and may yield inconclusive results from environmental materials. The second-generation sequencing tools have much-improved precision and sensitivity of detection, but these analyses are costly and may take several days to months. Of the third-generation sequencing techniques, the portable MinION device (Oxford Nanopore Technologies) has received much attention because of its small size and possibility of rapid analysis at reasonable cost. Here, we compare the relative performances of two third-generation sequencing instruments, MinION and Sequel (Pacific Biosciences), in identification and diagnostics of fungal and oomycete pathogens from conifer (Pinaceae) needles and potato (Solanum tuberosum) leaves and tubers. We demonstrate that the Sequel instrument is efficient for metabarcoding of complex samples, whereas MinION is not suited for this purpose due to a high error rate and multiple biases. However, we find that MinION can be utilized for rapid and accurate identification of dominant pathogenic organisms and other associated organisms from plant tissues following both amplicon-based and PCR-free metagenomics approaches. Using the metagenomics approach with shortened DNA extraction and incubation times, we performed the entire MinION workflow, from sample preparation through DNA extraction, sequencing, bioinformatics, and interpretation, in 2.5 h. We advocate the use of MinION for rapid diagnostics of pathogens and potentially other organisms, but care needs to be taken to control or account for multiple potential technical biases. IMPORTANCE Microbial pathogens cause enormous losses to agriculture and forestry, but current combined culturing- and molecular identification-based detection methods are too slow for rapid identification and application of countermeasures.
Here, we develop new and rapid protocols for Oxford Nanopore MinION-based third-generation diagnostics of plant pathogens that greatly improve the speed of diagnostics. However, due to the high error rate and technical biases in MinION, the Pacific Biosciences Sequel platform is more useful for in-depth amplicon-based biodiversity monitoring (metabarcoding) from complex environmental samples. Copyright © 2019 American Society for Microbiology.
DNA barcoding has been used for decades, although it has mostly been applied to single species. For traditional Chinese medicine (TCM), which is mainly used in the form of combinations, one type of multi-species mixture, identification is crucial for clinical usage. Next-generation sequencing (NGS) has been used to address this authentication issue for the past few years, but conventional NGS technology is hampered in application by its short sequencing reads and systematic errors. Here, a novel method, full-length multi-barcoding (FLMB) via long-read sequencing, is employed for the identification of biological compositions in herbal compound formulas in adequate and well-controlled studies. By directly sequencing the full-length amplicons of ITS2 and psbA-trnH through single-molecule real-time (SMRT) technology, the biological composition of a classical prescription, Sheng-Mai-San (SMS), was analyzed. At the same time, clone-dependent Sanger sequencing was carried out as a parallel control. Further, another formula, Sanwei-Jili-San (SJS), was analyzed with the ITS2 and CO1 genes. All the ingredients in the samples of SMS and SJS were successfully authenticated at the species level, and 11 exogenous species were also detected, some of which were considered common contaminants in these products. Methodological analysis demonstrated that this method was sensitive, accurate, and reliable. FLMB, a superior yet feasible approach for the identification of biological complex mixtures, was established and elucidated, demonstrating an interpretation of DNA barcoding that could lead to its application in multi-species mixtures.
Expedited assessment of terrestrial arthropod diversity by coupling Malaise traps with DNA barcoding.
Monitoring changes in terrestrial arthropod communities over space and time requires a dramatic increase in the speed and accuracy of processing samples that cannot be achieved with morphological approaches. The combination of DNA barcoding and Malaise traps allows expedited, comprehensive inventories of species abundance whose cost will rapidly decline as high-throughput sequencing technologies advance. Aside from detailing protocols from specimen sorting to data release, this paper describes their use in a survey of arthropod diversity in a national park that examined 21,194 specimens representing 2255 species. These protocols can support arthropod monitoring programs at regional, national, and continental scales.
Plastid genomes from diverse glaucophyte genera reveal a largely conserved gene content and limited architectural diversity.
Plastid genome (ptDNA) data of Glaucophyta have been limited for many years to the genus Cyanophora. Here, we sequenced the ptDNAs of Gloeochaete wittrockiana, Cyanoptyche gloeocystis, Glaucocystis incrassata, and Glaucocystis sp. BBH. The reported sequences are the first genome-scale plastid data available for these three poorly studied glaucophyte genera. Although the Glaucophyta plastids appear morphologically “ancestral,” they actually bear derived genomes not radically different from those of red algae or viridiplants. The glaucophyte plastid coding capacity is highly conserved (112 genes shared) and the architecture of the plastid chromosomes is relatively simple. Phylogenomic analyses recovered Glaucophyta as the earliest diverging Archaeplastida lineage, but the position of viridiplants as the first branching group was not rejected by the approximately unbiased test. Pairwise distances estimated from 19 different plastid genes revealed that the highest sequence divergence between glaucophyte genera is frequently higher than distances between species of different classes within red algae or viridiplants. Gene synteny and sequence similarity in the ptDNAs of the two Glaucocystis species analyzed are conserved. However, the ptDNA of Gla. incrassata contains a 7.9-kb insertion not detected in Glaucocystis sp. BBH. The insertion contains ten open reading frames that include four coding regions similar to bacterial serine recombinases (two open reading frames), DNA primases, and peptidoglycan aminohydrolases. These three enzymes, often encoded in bacterial plasmids and bacteriophage genomes, are known to participate in the mobilization and replication of DNA mobile elements. It is therefore plausible that the insertion in Gla. incrassata ptDNA is derived from a DNA mobile element.
The wide implementation of next-generation sequencing (NGS) technologies has revolutionized the field of medical genetics. However, the short read lengths of currently used sequencing approaches pose a limitation for identification of structural variants, sequencing repetitive regions, phasing alleles and distinguishing highly homologous genomic regions. These limitations may significantly contribute to the diagnostic gap in patients with genetic disorders who have undergone standard NGS, like whole exome or even genome sequencing. Now, the emerging long-read sequencing (LRS) technologies may offer improvements in the characterization of genetic variation and regions that are difficult to assess with the currently prevailing NGS approaches. LRS has so far mainly been used to investigate genetic disorders with previously known or strongly suspected disease loci. While these targeted approaches already show the potential of LRS, it remains to be seen whether LRS technologies can soon enable true whole genome sequencing routinely. Ultimately, this could allow the de novo assembly of individual whole genomes used as a generic test for genetic disorders. In this article, we summarize the current LRS-based research on human genetic disorders and discuss the potential of these technologies to facilitate the next major advancements in medical genetics.
Birds are a group with immense availability of genomic resources, and hundreds of forthcoming genomes at the doorstep. We review recent developments in whole genome sequencing, phylogenomics, and comparative genomics of birds. Short read based genome assemblies are common, largely due to efforts of the Bird 10K genome project (B10K). Chromosome-level assemblies are expected to increase due to improved long-read sequencing. The available genomic data has enabled the reconstruction of the bird tree of life with increasing confidence and resolution, but challenges remain in the early splits of Neoaves due to their explosive diversification after the Cretaceous-Paleogene (K-Pg) event. Continued genomic sampling of the bird tree of life will not just better reflect their evolutionary history but also shine new light onto the organization of phylogenetic signal and conflict across the genome. The comparatively simple architecture of avian genomes makes them a powerful system to study the molecular foundation of bird specific traits. Birds are on the verge of becoming an extremely resourceful system to study biodiversity from the nucleotide up.
Comparative genomic and phylogenetic analyses of Populus section Leuce using complete chloroplast genome sequences
Species of Populus section Leuce are distributed throughout most parts of the Northern Hemisphere and have important economic and ecological significance. However, due to frequent hybridization within Leuce, the phylogenetic relationships between species have not been clarified. The chloroplast (cp) genome is characterized by maternal inheritance and relatively conservative mutation rates; thus, it is a powerful tool for building phylogenetic trees. In this study, we used the PacBio SEQUEL software to determine that the cp genome of Populus tomentosa has a length of 156,558 bp, including a large single-copy region (84,717 bp), a small single-copy region (16,555 bp), and a pair of inverted repeat regions (27,643 bp each). The cp genome contains 131 unique genes, including 37 transfer RNAs, 8 ribosomal RNAs, and 86 protein-coding genes. We compared the cp genomes of seven species of section Leuce and identified five cp DNA markers with >1% variable sites. Phylogenetic analyses revealed two evolutionary branches for section Leuce. The species with the closest relationship to P. tomentosa was P. adenopoda, followed by P. alba. These cp genome data will help to determine the cp evolution of section Leuce and further elucidate the origin of P. tomentosa.
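The region lengths and gene counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes the usual quadripartite chloroplast layout (one large single-copy region, one small single-copy region, and two inverted repeat copies) and treats the 27,643 bp figure as the length of a single repeat copy; the function name `quadripartite_length` is illustrative only.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total length of a quadripartite plastid genome.

    lsc, ssc: single-copy region lengths in bp
    ir: length of ONE inverted repeat copy in bp (counted twice)
    """
    return lsc + ssc + 2 * ir


# Values as reported in the abstract for Populus tomentosa.
total_bp = quadripartite_length(lsc=84_717, ssc=16_555, ir=27_643)
gene_total = 37 + 8 + 86  # tRNAs + rRNAs + protein-coding genes

print(total_bp, gene_total)
```

Both sums agree with the stated totals (156,558 bp and 131 genes), which supports reading the inverted repeat length as a per-copy value.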
Genome editing has proven to be highly potent in the generation of functional gene knockouts in dividing cells. In the CNS, however, efficient technologies to repair sequences are yet to materialize. Reprogramming on the mRNA level is an attractive alternative as it provides means to perform in situ editing of coding sequences without nuclease dependency. Furthermore, de novo sequences can be inserted without the requirement of homologous recombination. Such reprogramming would enable efficient editing in quiescent cells (e.g., neurons) with an attractive safety profile for translational therapies. In this study, we applied a novel molecular-barcoded screening assay to investigate RNA trans-splicing in mammalian neurons. Through three alternative screening systems in cell culture and in vivo, we demonstrate that factors determining trans-splicing are reproducible regardless of the screening system. With this screening, we have located the most permissive trans-splicing sequences targeting an intron in the Synapsin I gene. Using viral vectors, we were able to splice full-length fluorophores into the mRNA while retaining very low off-target expression. Furthermore, this approach also showed evidence of functionality in the mouse striatum. However, in its current form, the trans-splicing events are stochastic and the overall activity lower than would be required for therapies targeting loss-of-function mutations. Nevertheless, the herein described barcode-based screening assay provides a unique possibility to screen and map large libraries in single animals or cell assays with very high precision. © 2018 Davidsson et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Second-generation, high-throughput sequencing methods have greatly improved our understanding of the ecology of soil microorganisms, yet the short barcodes (< 500 bp) provide limited taxonomic and phylogenetic information for species discrimination and taxonomic assignment. Here, we utilized the third-generation Pacific Biosciences (PacBio) RSII and Sequel instruments to evaluate the suitability of full-length internal transcribed spacer (ITS) barcodes and longer rRNA gene amplicons for metabarcoding Fungi, Oomycetes and other eukaryotes in soil samples. Metabarcoding revealed multiple errors and biases: Taq polymerase substitution errors and mis-incorporating indels in sequencing homopolymers constitute major errors; sequence length biases occur during PCR, library preparation, loading to the sequencing instrument and quality filtering; primer-template mismatches bias the taxonomic profile when using regular and highly degenerate primers. The RSII and Sequel platforms enable the sequencing of amplicons up to 3000 bp, but the sequence quality remains slightly inferior to Illumina sequencing, especially in longer amplicons. The full ITS barcode and flanking rRNA small subunit gene greatly improve taxonomic identification at the species and phylum levels, respectively. We conclude that PacBio sequencing provides a viable alternative for metabarcoding of organisms that are of relatively low diversity, require > 500-bp barcode for reliable identification or when phylogenetic approaches are intended. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.
Increasing sorghum yields by seed treatment with an aqueous extract of the plant Eclipta alba may involve a dual mechanism of hydropriming and suppression of fungal pathogens
Background: Soaking of sorghum seeds for six hours in an aqueous extract of Eclipta alba has been shown to increase the yield of sorghum in field experiments. The effect on yield is known to depend on field location, and a mechanism involving pathogen suppression has been proposed. However, it has not been clear to what extent the same effect can be obtained by soaking seeds in pure water (hydropriming). To address this question, fifty-eight field tests were conducted comparing no treatment of seeds, hydropriming, and treatment with plant extract. Experiments were distributed over three years in Burkina Faso at three locations previously showing a positive yield response to the plant extract. Results: Despite strong variation across locations and years, a mean yield increase of 19.6% was found for hydropriming compared to no treatment (p?.018). For the plant extract, an additional yield increase of 32.1% was found (p?.016), corresponding to a total increase of 51.7%. In a subset of 15 experiments, a positive but non-significant correlation was observed between the additional effect of the plant extract and the effect of a binary pesticide, Calthio C. Significantly, however, the E. alba extract reduced the number of seedlings infected by seed-borne filamentous fungi (p?.05). A more than five-fold reduction in infection was found for the E. alba extract compared to hydropriming and included potential pathogens of sorghum: Epicoccum sorghinum and Curvularia spp. Conclusion: With 6 hours of soaking, hydropriming was an inherent component of seed treatment with the E. alba extract and contributed significantly to the overall observed increase in yield and emergence. An additional yield increase was caused by factor(s) derived from the plant, E. alba, and may involve suppression of pathogenic fungi.
Molecular characterization of eukaryotic algal communities in the tropical phyllosphere based on real-time sequencing of the 18S rDNA gene.
Foliicolous algae are a common occurrence in tropical forests. They are referable to a few simple morphotypes (unicellular, sarcinoid-like or filamentous), which makes their morphology of limited usefulness for taxonomic studies and species diversity assessments. The relationship between the algal community and its host phyllosphere has not been clear. In order to obtain a more accurate assessment, we used single molecule real-time sequencing of the 18S rDNA gene to characterize the eukaryotic algal community in an area of South-western China. We annotated 2922 OTUs belonging to five classes: Ulvophyceae, Trebouxiophyceae, Chlorophyceae, Dinophyceae and Eustigmatophyceae. Novel clades formed by large numbers of sequences of green algae were detected in the order Trentepohliales (Ulvophyceae) and the Watanabea clade (Trebouxiophyceae), suggesting that these foliicolous communities may be substantially more diverse than so far appreciated and require further research. Species in Trentepohliales, the Watanabea clade and the Apatococcus clade were detected as the core members of the phyllosphere community studied. Communities from different host trees and sampling sites were not significantly different in terms of OTU composition. However, the communities of Musa and Ravenala differed significantly from those of other host plants at the genus level, since they were dominated by Trebouxiophycean epiphytes. The cryptic diversity of eukaryotic algae, especially Chlorophytes, in the tropical phyllosphere is very high. The community structure at the species level has no significant relationship with either host phyllosphere or location. The core algal community in the tropical phyllosphere consists of members of Trentepohliales, the Watanabea clade and the Apatococcus clade. Our study provided a large number of novel 18S rDNA sequences that will be useful for unravelling the cryptic diversity of phyllosphere eukaryotic algae and for comparisons with similar future studies on this type of community.
Clinical PathoScope: rapid alignment and filtration for accurate pathogen identification in clinical samples using unassembled sequencing data.
The use of sequencing technologies to investigate the microbiome of a sample can positively impact patient healthcare by providing therapeutic targets for personalized disease treatment. However, these samples contain genomic sequences from various sources that complicate the identification of pathogens. Here we present Clinical PathoScope, a pipeline to rapidly and accurately remove host contamination, isolate microbial reads, and identify potential disease-causing pathogens. We have accomplished three essential tasks in the development of Clinical PathoScope. First, we developed an optimized framework for pathogen identification using a computational subtraction methodology in concordance with read trimming and ambiguous read reassignment. Second, we have demonstrated the ability of our approach to identify multiple pathogens in a single clinical sample, accurately identify pathogens at the subspecies level, and determine the nearest phylogenetic neighbor of novel or highly mutated pathogens using real clinical sequencing data. Finally, we have shown that Clinical PathoScope outperforms previously published pathogen identification methods with regard to computational speed, sensitivity, and specificity. Clinical PathoScope is the only pathogen identification method currently available that can identify multiple pathogens from mixed samples and distinguish between very closely related species and strains in samples with very few reads per pathogen. Furthermore, Clinical PathoScope does not rely on genome assembly and thus can more rapidly complete the analysis of a clinical sample when compared with current assembly-based methods. Clinical PathoScope is freely available at: http://sourceforge.net/projects/pathoscope/.
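The idea of computational subtraction mentioned in this abstract can be illustrated with a toy example. This is not Clinical PathoScope's code or interface: alignment results are represented here by hypothetical sets of read IDs standing in for aligner output, and the function name `computational_subtraction` is an assumption made for illustration. The sketch keeps only reads that hit a microbial target library and do not hit the host reference.

```python
def computational_subtraction(reads, host_hits, target_hits):
    """Return reads that hit the target library but not the host reference.

    reads: dict mapping read ID -> sequence
    host_hits, target_hits: sets of read IDs (stand-ins for aligner output)
    """
    return {
        read_id: seq
        for read_id, seq in reads.items()
        if read_id in target_hits and read_id not in host_hits
    }


# Toy input: r1 is host only, r2 is microbial only,
# r3 hits both references, r4 hits neither.
reads = {"r1": "ACGT", "r2": "GGCA", "r3": "TTAG", "r4": "CCGA"}
host_hits = {"r1", "r3"}
target_hits = {"r2", "r3"}

retained = computational_subtraction(reads, host_hits, target_hits)
print(sorted(retained))
```

In this toy setup only `r2` is retained. A real pipeline would also need read trimming and a rule for reads that map ambiguously to several genomes, which the abstract names as separate steps and which are not modelled here.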
Community profiling of Fusarium in combination with other plant associated fungi in different crop species using SMRT Sequencing.
Fusarium head blight, caused by fungi from the genus Fusarium, is one of the most harmful cereal diseases, resulting not only in severe yield losses but also in mycotoxin-contaminated and health-threatening grains. Fusarium head blight is caused by a diverse set of species that have different host ranges, mycotoxin profiles and responses to agricultural practices. Thus, understanding the composition of Fusarium communities in the field is crucial for estimating their impact and also for the development of effective control measures. Up to now, most molecular tools that monitor Fusarium communities on plants are limited to certain species and do not distinguish other plant-associated fungi. To close these gaps, we developed a sequencing-based community profiling methodology for crop-associated fungi with a focus on the genus Fusarium. By analyzing a 1600-bp amplicon spanning the highly variable segments ITS and D1-D3 of the ribosomal operon by PacBio SMRT sequencing, we were able to robustly quantify Fusarium down to species level through clustering against reference sequences. The newly developed methodology was successfully validated in mock communities and provided results similar to the culture-based assessment of Fusarium communities by seed health tests in grain samples from different crop species. Finally, we exemplified the newly developed methodology in a field experiment with a wheat-maize crop sequence under different cover crop and tillage regimes. We analyzed wheat straw residues, cover crop shoots and maize grains and found that the cover crop hairy vetch (Vicia villosa) acts as a potent alternative host for Fusarium (OTU F.ave/tri), showing an eightfold higher relative abundance compared with other cover crop treatments. Moreover, as the newly developed methodology also allows other crop-associated fungi to be traced, we found that vetch and green fallow hosted further fungal plant pathogens, including Zymoseptoria tritici.
Thus, besides their beneficial traits, cover crops can also entail phytopathological risks by acting as alternative hosts for Fusarium and other noxious plant pathogens. The newly developed sequencing-based methodology is a powerful diagnostic tool to trace Fusarium in combination with other fungi associated with different crop species.