This tutorial provides an overview of the Circular Consensus Sequence (CCS) analysis application. The CCS algorithm is used in applications that require distinguishing closely related DNA molecules in the same…
Over the past decade, RNA sequencing (RNA-seq) has become an indispensable tool for transcriptome-wide analysis of differential gene expression and differential splicing of mRNAs. However, as next-generation sequencing technologies have developed, so too has RNA-seq. Now, RNA-seq methods are available for studying many different aspects of RNA biology, including single-cell gene expression, translation (the translatome) and RNA structure (the structurome). Exciting new applications are being explored, such as spatial transcriptomics (spatialomics). Together with new long-read and direct RNA-seq technologies and better computational tools for data analysis, innovations in RNA-seq are contributing to a fuller understanding of RNA biology, from questions such as when and where transcription occurs to the folding and intermolecular interactions that govern RNA function.
Development of high-throughput sequencing techniques has greatly benefited our understanding of microbial ecology; yet the methods producing short reads suffer from poor species-level resolution and uncertainty of identification. Here we optimize PacBio-based metabarcoding protocols covering the Internal Transcribed Spacer (ITS region) and partial Small Subunit (SSU) of the rRNA gene for species-level identification of all eukaryotes, with a specific focus on Fungi (including Glomeromycota) and Stramenopila (particularly Oomycota). Based on tests on composite soil samples and mock communities, we propose the most suitable degenerate primers, ITS9munngs + ITS4ngsUni, for eukaryotes and selected groups therein, and discuss the pros and cons of long read-based identification of eukaryotes.
Relative Performance of MinION (Oxford Nanopore Technologies) versus Sequel (Pacific Biosciences) Third-Generation Sequencing Instruments in Identification of Agricultural and Forest Fungal Pathogens.
Culture-based molecular identification methods have revolutionized detection of pathogens, yet these methods are slow and may yield inconclusive results from environmental materials. The second-generation sequencing tools have much-improved precision and sensitivity of detection, but these analyses are costly and may take several days to months. Of the third-generation sequencing techniques, the portable MinION device (Oxford Nanopore Technologies) has received much attention because of its small size and possibility of rapid analysis at reasonable cost. Here, we compare the relative performances of two third-generation sequencing instruments, MinION and Sequel (Pacific Biosciences), in identification and diagnostics of fungal and oomycete pathogens from conifer (Pinaceae) needles and potato (Solanum tuberosum) leaves and tubers. We demonstrate that the Sequel instrument is efficient for metabarcoding of complex samples, whereas MinION is not suited for this purpose due to a high error rate and multiple biases. However, we find that MinION can be utilized for rapid and accurate identification of dominant pathogenic organisms and other associated organisms from plant tissues following both amplicon-based and PCR-free metagenomics approaches. Using the metagenomics approach with shortened DNA extraction and incubation times, we performed the entire MinION workflow, from sample preparation through DNA extraction, sequencing, bioinformatics, and interpretation, in 2.5 h. We advocate the use of MinION for rapid diagnostics of pathogens and potentially other organisms, but care needs to be taken to control or account for multiple potential technical biases.
IMPORTANCE: Microbial pathogens cause enormous losses to agriculture and forestry, but current combined culturing- and molecular identification-based detection methods are too slow for rapid identification and application of countermeasures. Here, we develop new and rapid protocols for Oxford Nanopore MinION-based third-generation diagnostics of plant pathogens that greatly improve the speed of diagnostics. However, due to high error rate and technical biases in MinION, the Pacific Biosciences Sequel platform is more useful for in-depth amplicon-based biodiversity monitoring (metabarcoding) from complex environmental samples. Copyright © 2019 American Society for Microbiology.
Supernumerary B chromosomes (Bs) are extra karyotype units in addition to A chromosomes, and are found in some fungi and thousands of animal and plant species. Bs are uniquely characterized by their non-Mendelian inheritance, and represent one of the best examples of genomic conflict. Over the last decades, their genetic composition, function and evolution have remained an unresolved question, although a few successful attempts have been made to address these phenomena. A classical concept based on cytogenetics and genetics is that Bs are selfish and abundant with DNA repeats and transposons, and in most cases, they do not carry any function. However, the recent rapid development of high-throughput multi-omics techniques has shifted B research towards a new-born field that we call “B-omics”. We review the recent literature and add novel perspectives to B research, discussing the role of new technologies in understanding the mechanisms of the molecular evolution and function of Bs. The modern view states that B chromosomes are enriched with genes for many significant biological functions, including but not limited to the interesting set of genes related to cell cycle and chromosome structure. Furthermore, the presence of B chromosomes could favor genomic rearrangements and influence the nuclear environment affecting the function of other chromatin regions. We hypothesize that B chromosomes might play a key function in driving their transmission and maintenance inside the cell, as well as offer an extra genomic compartment for evolution.
Gammaherpesvirus Readthrough Transcription Generates a Long Non-Coding RNA That Is Regulated by Antisense miRNAs and Correlates with Enhanced Lytic Replication In Vivo.
Gammaherpesviruses, including the human pathogens Epstein-Barr virus (EBV) and Kaposi’s sarcoma-associated herpesvirus (KSHV), are oncogenic viruses that establish lifelong infections in hosts and are associated with the development of lymphoproliferative diseases and lymphomas. Recent studies have shown that the majority of the mammalian genome is transcribed and gives rise to numerous long non-coding RNAs (lncRNAs). Likewise, the large double-stranded DNA virus genomes of herpesviruses undergo pervasive transcription, including the expression of many as yet uncharacterized lncRNAs. Murine gammaherpesvirus 68 (MHV68, MuHV-4, γHV68) is a natural pathogen of rodents, and is genetically and pathogenically related to EBV and KSHV, providing a highly tractable model for studies of gammaherpesvirus biology and pathogenesis. Through the integrated use of parallel data sets from multiple sequencing platforms, we previously resolved transcripts throughout the MHV68 genome, including at least 144 novel transcript isoforms. Here, we sought to molecularly validate novel transcripts identified within the M3/M2 locus, which harbors genes that code for the chemokine binding protein M3, the latency B cell signaling protein M2, and 10 microRNAs (miRNAs). Using strand-specific northern blots, we validated the presence of M3-04, a 3.91 kb polyadenylated transcript that initiates at the M3 transcription start site and reads through the M3 open reading frame (ORF), the M3 poly(A) signal sequence, and the M2 ORF. This unexpected transcript was solely localized to the nucleus, strongly suggesting that it is not translated and instead may function as a lncRNA. Use of an MHV68 mutant lacking two M3-04-antisense pre-miRNA stem loops resulted in highly increased expression of M3-04 and increased virus replication in the lungs of infected mice, demonstrating a key role for these RNAs in regulation of lytic infection.
Together these findings suggest the possibility of a tripartite regulatory relationship between the lncRNA M3-04, antisense miRNAs, and the latency gene M2.
Chromulinavorax destructans, a pathogen of microzooplankton that provides a window into the enigmatic candidate phylum Dependentiae.
Members of the major candidate phylum Dependentiae (a.k.a. TM6) are widespread across diverse environments from showerheads to peat bogs; yet, with the exception of two isolates infecting amoebae, they are only known from metagenomic data. The limited knowledge of their biology indicates that they have a long evolutionary history of parasitism. Here, we present Chromulinavorax destructans (Strain SeV1), the first isolate of this phylum to infect a representative from a widespread and ecologically significant group of heterotrophic flagellates, the microzooplankter Spumella elongata (Strain CCAP 955/1). Chromulinavorax destructans has a reduced 1.2 Mb genome that is so specialized for infection that it shows no evidence of complete metabolic pathways, but encodes an extensive transporter system for importing nutrients and energy in the form of ATP from the host. Its replication causes extensive reorganization and expansion of the mitochondrion, effectively surrounding the pathogen, consistent with its dependency on the host for energy. Nearly half (44%) of the inferred proteins contain signal sequences for secretion, including many without recognizable similarity to proteins of known function, as well as 98 copies of proteins with an ankyrin-repeat domain; ankyrin repeats are known effectors of host modulation, suggesting the presence of an extensive host-manipulation apparatus. These observations help to cement members of this phylum as widespread and diverse parasites infecting a broad range of eukaryotic microbes.
The Genome of Armadillidium vulgare (Crustacea, Isopoda) Provides Insights into Sex Chromosome Evolution in the Context of Cytoplasmic Sex Determination.
The terrestrial isopod Armadillidium vulgare is an original model to study the evolution of sex determination and symbiosis in animals. Its sex can be determined by ZW sex chromosomes, or by feminizing Wolbachia bacterial endosymbionts. Here, we report the sequence and analysis of the ZW female genome of A. vulgare. A distinguishing feature of the 1.72 gigabase assembly is the abundance of repeats (68% of the genome). We show that the Z and W sex chromosomes are essentially undifferentiated at the molecular level and the W-specific region is extremely small (at most several hundreds of kilobases). Our results suggest that recombination suppression has not spread very far from the sex-determining locus, if at all. This is consistent with A. vulgare possessing evolutionarily young sex chromosomes. We characterized multiple Wolbachia nuclear inserts in the A. vulgare genome, none of which is associated with the W-specific region. We also identified several candidate genes that may be involved in the sex determination or sexual differentiation pathways. The A. vulgare genome serves as a resource for studying the biology and evolution of crustaceans, one of the most speciose and emblematic metazoan groups. © The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Newly designed 16S rRNA metabarcoding primers amplify diverse and novel archaeal taxa from the environment.
High-throughput studies of microbial communities suggest that Archaea are a widespread component of microbial diversity in various ecosystems. However, proper quantification of archaeal diversity and community ecology remains limited, as sequence coverage of Archaea is usually low owing to the inability of available prokaryotic primers to amplify archaeal rRNA genes as efficiently as bacterial ones. To improve identification and quantification of Archaea, we designed and validated the utility of several primer pairs to efficiently amplify archaeal 16S rRNA genes based on up-to-date reference genes. We demonstrate that several of these primer pairs amplify phylogenetically diverse Archaea with high sequencing coverage, outperforming commonly used primers. Based on comparing the resulting long 16S rRNA gene fragments with public databases from all habitats, we found several novel family- to phylum-level archaeal taxa from topsoil and surface water. Our results suggest that archaeal diversity has been largely overlooked due to the limitations of available primers, and that improved primer pairs enable more accurate estimation of archaeal diversity. © 2018 The Authors. Environmental Microbiology Reports published by Society for Applied Microbiology and John Wiley & Sons Ltd.
A systematic review of the Trypanosoma cruzi genetic heterogeneity, host immune response and genetic factors as plausible drivers of chronic chagasic cardiomyopathy.
Chagas disease is a complex tropical pathology caused by the kinetoplastid Trypanosoma cruzi. This parasite displays massive genetic diversity and has been classified by international consensus into at least six Discrete Typing Units (DTUs) that are broadly distributed across the American continent. The main clinical manifestation of the disease is chronic chagasic cardiomyopathy (CCC), which is lethal in infected individuals. However, one intriguing feature is that only 30-40% of infected individuals will develop CCC. Some authors have suggested that the immune response, host genetic factors, virulence factors and even the massive genetic heterogeneity of T. cruzi are responsible for this clinical pattern. To date, no conclusive data explain why only a fraction of infected individuals develop CCC. Therefore, we decided to conduct a systematic review analysing the host genetic factors, immune response, cytokine production, virulence factors and the plausible association between the parasite DTUs and CCC. The epidemiological and clinical implications are herein discussed.
Confident phylogenetic identification of uncultured prokaryotes through long read amplicon sequencing of the 16S-ITS-23S rRNA operon.
Amplicon sequencing of the 16S rRNA gene is the predominant method to quantify microbial compositions and to discover novel lineages. However, traditional short amplicons often do not contain enough information to confidently resolve their phylogeny. Here we present a cost-effective protocol that amplifies a large part of the rRNA operon and sequences the amplicons with PacBio technology. We tested our method on a mock community and developed a read-curation pipeline that reduces the overall read error rate to 0.18%. Applying our method to four environmental samples, we captured near full-length rRNA operon amplicons from a large diversity of prokaryotes. The method operated at moderately high throughput (22,286-37,850 raw CCS reads) and generated a large number of putative novel archaeal 23S rRNA gene sequences compared to the archaeal SILVA database. These long amplicons allowed for higher resolution during taxonomic classification by means of long (~1000 bp) 16S rRNA gene fragments and for substantially more confident phylogenies by means of combined near full-length 16S and 23S rRNA gene sequences, compared to shorter traditional amplicons (250 bp of the 16S rRNA gene). We recommend our method to those who wish to cost-effectively and confidently estimate the phylogenetic diversity of prokaryotes in environmental samples at high throughput. © 2019 The Authors. Environmental Microbiology published by Society for Applied Microbiology and John Wiley & Sons Ltd.
Development of a Molecular Marker Linked to the A4 Locus and the Structure of HD Genes in Pleurotus eryngii.
Allelic differences in A and B mating-type loci are a prerequisite for the progression of mating in Pleurotus eryngii; thus, crossing is hampered by this biological barrier in inbreeding. Molecular markers linked to mating types of P. eryngii KNR2312 were investigated with randomly amplified polymorphic DNA to enhance crossing efficiency. An A4-linked sequence was identified and used to find the adjacent genomic region with the entire motif of the A locus from a contig sequenced by PacBio. The sequence-characterized amplified region (SCAR) marker 7-2299 distinguished A4 mating-type monokaryons from KNR2312 and other strains. A BLAST search of flanking sequences revealed that the A4 locus had a general feature consisting of the putative HD1 and HD2 genes. Both putative HD transcription factors contain a homeodomain sequence and a nuclear localization sequence; however, valid dimerization motifs were found only in the HD1 protein. The ACAAT motif, which was reported to have relevance to sex determination, was found in the intergenic region. The SCAR marker could be applicable in the classification of mating types in the P. eryngii breeding program, and the A4 locus could be the basis for a multi-allele detection marker.
Assessment of the microbial diversity of Chinese Tianshan tibicos by single molecule, real-time sequencing technology.
Chinese Tianshan tibico grains were collected from the rural area of Tianshan in Xinjiang province, China. Typical tibico grains are known to consist of a polysaccharide matrix that embeds a variety of bacteria and yeasts. These grains are widely used in some rural regions to produce a beneficial sugary beverage that is slightly acidic and contains a low level of alcohol. This work aimed to characterize the microbiota composition of Chinese Tianshan tibicos using single molecule, real-time sequencing technology, which is advantageous in generating long reads. Our results revealed that the microbiota mainly comprised the bacterial species Lactobacillus hilgardii, Lactococcus raffinolactis, Leuconostoc mesenteroides and Zymomonas mobilis, together with a Guehomyces pullulans-dominated fungal community. The data generated in this work help identify beneficial microbes in Chinese Tianshan tibico grains.
Insights into the evolution and drug susceptibility of Babesia duncani from the sequence of its mitochondrial and apicoplast genomes.
Babesia microti and Babesia duncani are the main causative agents of human babesiosis in the United States. While significant knowledge about B. microti has been gained over the past few years, nothing is known about B. duncani biology, pathogenesis, mode of transmission or sensitivity to currently recommended therapies. Studies in immunocompetent wild type mice and hamsters have shown that unlike B. microti, infection with B. duncani results in severe pathology and ultimately death. The parasite factors involved in B. duncani virulence remain unknown. Here we report the first known completed sequence and annotation of the apicoplast and mitochondrial genomes of B. duncani. We found that the apicoplast genome of this parasite consists of a 34 kb monocistronic circular molecule encoding functions that are important for apicoplast gene transcription as well as translation and maturation of the organelle’s proteins. The mitochondrial genome of B. duncani consists of a 5.9 kb monocistronic linear molecule with two inverted repeats of 48 bp at both ends. Using the conserved cytochrome b (Cytb) and cytochrome c oxidase subunit I (coxI) proteins encoded by the mitochondrial genome, phylogenetic analysis revealed that B. duncani defines a new lineage among apicomplexan parasites distinct from B. microti, Babesia bovis, Theileria spp. and Plasmodium spp. Annotation of the apicoplast and mitochondrial genomes of B. duncani identified targets for development of effective therapies. Our studies set the stage for evaluation of the efficacy of these drugs alone or in combination against B. duncani in culture as well as in animal models. Copyright © 2018 Australian Society for Parasitology. Published by Elsevier Ltd. All rights reserved.
Despite the conserved essential function of centromeres, centromeric DNA itself is not conserved. The histone-H3 variant, CENP-A, is the epigenetic mark that specifies centromere identity. Paradoxically, CENP-A normally assembles on particular sequences at specific genomic locations. To gain insight into the specification of complex centromeres, here we take an evolutionary approach, fully assembling genomes and centromeres of related fission yeasts. Centromere domain organization, but not sequence, is conserved between Schizosaccharomyces pombe, S. octosporus and S. cryophilus with a central CENP-ACnp1 domain flanked by heterochromatic outer-repeat regions. Conserved syntenic clusters of tRNA genes and 5S rRNA genes occur across the centromeres of S. octosporus and S. cryophilus, suggesting conserved function. Interestingly, nonhomologous centromere central-core sequences from S. octosporus and S. cryophilus are recognized in S. pombe, resulting in cross-species establishment of CENP-ACnp1 chromatin and functional kinetochores. Therefore, despite the lack of sequence conservation, Schizosaccharomyces centromere DNA possesses intrinsic conserved properties that promote assembly of CENP-A chromatin.