Menu
July 19, 2019

Linking secondary metabolites to gene clusters through genome sequencing of six diverse Aspergillus species.

The fungal genus ofAspergillusis highly interesting, containing everything from industrial cell factories, model organisms, and human pathogens. In particular, this group has a prolific production of bioactive secondary metabolites (SMs). In this work, four diverseAspergillusspecies (A. campestris,A. novofumigatus,A. ochraceoroseus, andA. steynii) have been whole-genome PacBio sequenced to provide genetic references in threeAspergillussections.A. taichungensisandA. candidusalso were sequenced for SM elucidation. ThirteenAspergillusgenomes were analyzed with comparative genomics to determine phylogeny and genetic diversity, showing that each presented genome contains 15-27% genes not found in other sequenced Aspergilli. In particular,A. novofumigatuswas compared with the pathogenic speciesA. fumigatusThis suggests thatA. novofumigatuscan produce most of the same allergens, virulence, and pathogenicity factors asA. fumigatus, suggesting thatA. novofumigatuscould be as pathogenic asA. fumigatusFurthermore, SMs were linked to gene clusters based on biological and chemical knowledge and analysis, genome sequences, and predictive algorithms. We thus identify putative SM clusters for aflatoxin, chlorflavonin, and ochrindol inA. ochraceoroseus,A. campestris, andA. steynii, respectively, and novofumigatonin,ent-cycloechinulin, andepi-aszonalenins inA. novofumigatusOur study delivers six fungal genomes, showing the large diversity found in theAspergillusgenus; highlights the potential for discovery of beneficial or harmful SMs; and supports reports ofA. novofumigatuspathogenicity. It also shows how biological, biochemical, and genomic information can be combined to identify genes involved in the biosynthesis of specific SMs.


July 19, 2019

Structure and distribution of centromeric retrotransposons at diploid and allotetraploid Coffea centromeric and pericentromeric regions.

Centromeric regions of plants are generally composed of large array of satellites from a specific lineage ofGypsyLTR-retrotransposons, called Centromeric Retrotransposons. Repeated sequences interact with a specific H3 histone, playing a crucial function on kinetochore formation. To study the structure and composition of centromeric regions in the genusCoffea, we annotated and classified Centromeric Retrotransposons sequences from the allotetraploidC. arabicagenome and its two diploid ancestors:Coffea canephoraandC. eugenioides. Ten distinct CRC (Centromeric Retrotransposons inCoffea) families were found. The sequence mapping and FISH experiments of CRC Reverse Transcriptase domains inC. canephora, C. eugenioides, andC. arabicaclearly indicate a strong and specific targeting mainly onto proximal chromosome regions, which can be associated also with heterochromatin. PacBio genome sequence analyses of putative centromeric regions onC. arabicaandC. canephorachromosomes showed an exceptional density of one family of CRC elements, and the complete absence of satellite arrays, contrasting with usual structure of plant centromeres. Altogether, our data suggest a specific centromere organization inCoffea, contrasting with other plant genomes.


July 19, 2019

Single molecule real time sequencing in ADTKD-MUC1 allows complete assembly of the VNTR and exact positioning of causative mutations.

Recently, the Mucin-1 (MUC1) gene has been identified as a causal gene of autosomal dominant tubulointerstitial kidney disease (ADTKD). Most causative mutations are buried within a GC-rich 60 basepair variable number of tandem repeat (VNTR), which escapes identification by massive parallel sequencing methods due to the complexity of the VNTR. We established long read single molecule real time sequencing (SMRT) targeted to the MUC1-VNTR as an alternative strategy to the snapshot assay. Our approach allows complete VNTR assembly, thereby enabling the detection of all variants residing within the VNTR and simultaneous determination of VNTR length. We present high resolution data on the VNTR architecture for a cohort of snapshot positive (n?=?9) and negative (n?=?7) ADTKD families. By SMRT sequencing we could confirm the diagnosis in all previously tested cases, reconstruct both VNTR alleles and determine the exact position of the causative variant in eight of nine families. This study demonstrates that precise positioning of the causative mutation(s) and identification of other coding and noncoding sequence variants in ADTKD-MUC1 is feasible. SMRT sequencing could provide a powerful tool to uncover potential factors encoded within the VNTR that associate with intra- and interfamilial phenotype variability of MUC1 related kidney disease.


July 19, 2019

Phasevarions of bacterial pathogens: Methylomics sheds new light on old enemies.

A wide variety of bacterial pathogens express phase-variable DNA methyltransferases that control expression of multiple genes via epigenetic mechanisms. These randomly switching regulons – phasevarions – regulate genes involved in pathogenesis, host adaptation, and antibiotic resistance. Individual phase-variable genes can be identified in silico as they contain easily recognized features such as simple sequence repeats (SSRs) or inverted repeats (IRs) that mediate the random switching of expression. Conversely, phasevarion-controlled genes do not contain any easily identifiable features. The study of DNA methyltransferase specificity using Single-Molecule, Real-Time (SMRT) sequencing and methylome analysis has rapidly advanced the analysis of phasevarions by allowing methylomics to be combined with whole-transcriptome/proteome analysis to comprehensively characterize these systems in a number of important bacterial pathogens. Copyright © 2018 Elsevier Ltd. All rights reserved.


July 19, 2019

Resolving the complete genome of Kuenenia stuttgartiensis from a membrane bioreactor enrichment using Single-Molecule Real-Time sequencing.

Anaerobic ammonium-oxidizing (anammox) bacteria are a group of strictly anaerobic chemolithoautotrophic microorganisms. They are capable of oxidizing ammonium to nitrogen gas using nitrite as a terminal electron acceptor, thereby facilitating the release of fixed nitrogen into the atmosphere. The anammox process is thought to exert a profound impact on the global nitrogen cycle and has been harnessed as an environment-friendly method for nitrogen removal from wastewater. In this study, we present the first closed genome sequence of an anammox bacterium, Kuenenia stuttgartiensis MBR1. It was obtained through Single-Molecule Real-Time (SMRT) sequencing of an enrichment culture constituting a mixture of at least two highly similar Kuenenia strains. The genome of the novel MBR1 strain is different from the previously reported Kuenenia KUST reference genome as it contains numerous structural variations and unique genomic regions. We find new proteins, such as a type 3b (sulf)hydrogenase and an additional copy of the hydrazine synthase gene cluster. Moreover, multiple copies of ammonium transporters and proteins regulating nitrogen uptake were identified, suggesting functional differences in metabolism. This assembly, including the genome-wide methylation profile, provides a new foundation for comparative and functional studies aiming to elucidate the biochemical and metabolic processes of these organisms.


July 19, 2019

Coupling of single molecule, long read sequencing with IMGT/HighV-QUEST analysis expedites identification of SIV gp140-specific antibodies from scFv phage display libraries.

The simian immunodeficiency virus (SIV)/macaque model of human immunodeficiency virus (HIV)/acquired immunodeficiency syndrome pathogenesis is critical for furthering our understanding of the role of antibody responses in the prevention of HIV infection, and will only increase in importance as macaque immunoglobulin (IG) gene databases are expanded. We have previously reported the construction of a phage display library from a SIV-infected rhesus macaque (Macaca mulatta) using oligonucleotide primers based on human IG gene sequences. Our previous screening relied on Sanger sequencing, which was inefficient and generated only a few dozen sequences. Here, we re-analyzed this library using single molecule, real-time (SMRT) sequencing on the Pacific Biosciences (PacBio) platform to generate thousands of highly accurate circular consensus sequencing (CCS) reads corresponding to full length single chain fragment variable. CCS data were then analyzed through the international ImMunoGeneTics information system®(IMGT®)/HighV-QUEST (www.imgt.org) to identify variable genes and perform statistical analyses. Overall the library was very diverse, with 2,569 different IMGT clonotypes called for the 5,238 IGHV sequences assigned to an IMGT clonotype. Within the library, SIV-specific antibodies represented a relatively limited number of clones, with only 135 different IMGT clonotypes called from 4,594 IGHV-assigned sequences. Our data did confirm that the IGHV4 and IGHV3 gene usage was the most abundant within the rhesus antibodies screened, and that these genes were even more enriched among SIV gp140-specific antibodies. Although a broad range of VH CDR3 amino acid (AA) lengths was observed in the unpanned library, the vast majority of SIV gp140-specific antibodies demonstrated a more uniform VH CDR3 length (20 AA). This uniformity was far less apparent when VH CDR3 were classified according to their clonotype (range: 9-25 AA), which we believe is more relevant for specific antibody identification. Only 174 IGKV and 588 IGLV clonotypes were identified within the VL sequences associated with SIV gp140-specific VH. Together, these data strongly suggest that the combination of SMRT sequencing with the IMGT/HighV-QUEST querying tool will facilitate and expedite our understanding of polyclonal antibody responses during SIV infection and may serve to rapidly expand the known scope of macaque V genes utilized during these responses.


July 19, 2019

Biomonitoring for traditional herbal medicinal products using DNA metabarcoding and single molecule, real-time sequencing.

Global concerns have been paid to the potential hazard of traditional herbal medicinal products (THMPs). Substandard and counterfeit THMPs, including traditional Chinese patent medicine, health foods, dietary supplements, etc. are potential threats to public health. Recent marketplace studies using DNA barcoding have determined that the current quality control methods are not sufficient for ensuring the presence of authentic herbal ingredients and detection of contaminants/adulterants. An efficient biomonitoring method for THMPs is of great needed. Herein, metabarcoding and single-molecule, real-time (SMRT) sequencing were used to detect the multiple ingredients in Jiuwei Qianghuo Wan (JWQHW), a classical herbal prescription widely used in China for the last 800 years. Reference experimental mixtures and commercial JWQHW products from the marketplace were used to confirm the method. Successful SMRT sequencing results recovered 5416 and 4342 circular-consensus sequencing (CCS) reads belonging to the ITS2 and psbA-trnH regions. The results suggest that with the combination of metabarcoding and SMRT sequencing, it is repeatable, reliable, and sensitive enough to detect species in the THMPs, and the error in SMRT sequencing did not affect the ability to identify multiple prescribed species and several adulterants/contaminants. It has the potential for becoming a valuable tool for the biomonitoring of multi-ingredient THMPs.


July 19, 2019

Piercing the dark matter: bioinformatics of long-range sequencing and mapping.

Several new genomics technologies have become available that offer long-read sequencing or long-range mapping with higher throughput and higher resolution analysis than ever before. These long-range technologies are rapidly advancing the field with improved reference genomes, more comprehensive variant identification and more complete views of transcriptomes and epigenomes. However, they also require new bioinformatics approaches to take full advantage of their unique characteristics while overcoming their complex errors and modalities. Here, we discuss several of the most important applications of the new technologies, focusing on both the currently available bioinformatics tools and opportunities for future research.


July 19, 2019

Herbivorous turtle ants obtain essential nutrients from a conserved nitrogen-recycling gut microbiome.

Nitrogen acquisition is a major challenge for herbivorous animals, and the repeated origins of herbivory across the ants have raised expectations that nutritional symbionts have shaped their diversification. Direct evidence for N provisioning by internally housed symbionts is rare in animals; among the ants, it has been documented for just one lineage. In this study we dissect functional contributions by bacteria from a conserved, multi-partite gut symbiosis in herbivorous Cephalotes ants through in vivo experiments, metagenomics, and in vitro assays. Gut bacteria recycle urea, and likely uric acid, using recycled N to synthesize essential amino acids that are acquired by hosts in substantial quantities. Specialized core symbionts of 17 studied Cephalotes species encode the pathways directing these activities, and several recycle N in vitro. These findings point to a highly efficient N economy, and a nutritional mutualism preserved for millions of years through the derived behaviors and gut anatomy of Cephalotes ants.


July 19, 2019

The genome of Schmidtea mediterranea and the evolution of core cellular mechanisms.

The planarian Schmidtea mediterranea is an important model for stem cell research and regeneration, but adequate genome resources for this species have been lacking. Here we report a highly contiguous genome assembly of S. mediterranea, using long-read sequencing and a de novo assembler (MARVEL) enhanced for low-complexity reads. The S. mediterranea genome is highly polymorphic and repetitive, and harbours a novel class of giant retroelements. Furthermore, the genome assembly lacks a number of highly conserved genes, including critical components of the mitotic spindle assembly checkpoint, but planarians maintain checkpoint function. Our genome assembly provides a key model system resource that will be useful for studying regeneration and the evolutionary plasticity of core cell biological mechanisms.


July 19, 2019

Survey on the use of whole-genome sequencing for infectious diseases surveillance: Rapid expansion of European national capacities, 2015-2016.

Whole-genome sequencing (WGS) has become an essential tool for public health surveillance and molecular epidemiology of infectious diseases and antimicrobial drug resistance. It provides precise geographical delineation of spread and enables incidence monitoring of pathogens at genotype level. Coupled with epidemiological and environmental investigations, it delivers ultimate resolution for tracing sources of epidemic infections. To ascertain the level of implementation of WGS-based typing for national public health surveillance and investigation of prioritized diseases in the European Union (EU)/European Economic Area (EEA), two surveys were conducted in 2015 and 2016. The surveys were designed to determine the national public health reference laboratories’ access to WGS and operational WGS-based typing capacity for national surveillance of selected foodborne pathogens, antimicrobial-resistant pathogens, and vaccine-preventable diseases identified as priorities for European genomic surveillance. Twenty-eight and twenty-nine out of the 30 EU/EEA countries participated in the survey in 2015 and 2016, respectively. National public health reference laboratories in 22 and 25 countries had access to WGS-based typing for public health applications in 2015 and 2016, respectively. Reported reasons for limited or no access were lack of funding, staff, and expertise. Illumina technology was the most frequently used followed by Ion Torrent technology. The access to bioinformatics expertise and competence for routine WGS data analysis was limited. By mid-2016, half of the EU/EEA countries were using WGS analysis either as first- or second-line typing method for surveillance of the pathogens and antibiotic resistance issues identified as EU priorities. The sampling frame as well as bioinformatics analysis varied by pathogen/resistance issue and country. Core genome multilocus allelic profiling, also called cgMLST, was the most frequently used annotation approach for typing bacterial genomes suggesting potential bioinformatics pipeline compatibility. Further capacity development for WGS-based typing is ongoing in many countries and upon consolidation and harmonization of methods should enable pan-EU data exchange for genomic surveillance in the medium-term subject to the development of suitable data management systems and appropriate agreements for data sharing.


July 19, 2019

Genomic repeats, misassembly and reannotation: a case study with long-read resequencing of Porphyromonas gingivalis reference strains.

Without knowledge of their genomic sequences, it is impossible to make functional models of the bacteria that make up human and animal microbiota. Unfortunately, the vast majority of publicly available genomes are only working drafts, an incompleteness that causes numerous problems and constitutes a major obstacle to genotypic and phenotypic interpretation. In this work, we began with an example from the class Bacteroidia in the phylum Bacteroidetes, which is preponderant among human orodigestive microbiota. We successfully identify the genetic loci responsible for assembly breaks and misassemblies and demonstrate the importance and usefulness of long-read sequencing and curated reannotation.We showed that the fragmentation in Bacteroidia draft genomes assembled from massively parallel sequencing linearly correlates with genomic repeats of the same or greater size than the reads. We also demonstrated that some of these repeats, especially the long ones, correspond to misassembled loci in three reference Porphyromonas gingivalis genomes marked as circularized (thus complete or finished). We prove that even at modest coverage (30X), long-read resequencing together with PCR contiguity verification (rrn operons and an integrative and conjugative element or ICE) can be used to identify and correct the wrongly combined or assembled regions. Finally, although time-consuming and labor-intensive, consistent manual biocuration of three P. gingivalis strains allowed us to compare and correct the existing genomic annotations, resulting in a more accurate interpretation of the genomic differences among these strains.In this study, we demonstrate the usefulness and importance of long-read sequencing in verifying published genomes (even when complete) and generating assemblies for new bacterial strains/species with high genomic plasticity. We also show that when combined with biological validation processes and diligent biocurated annotation, this strategy helps reduce the propagation of errors in shared databases, thus limiting false conclusions based on incomplete or misleading information.


July 19, 2019

The complete and fully assembled genome sequence of Aeromonas salmonicida subsp. pectinolytica and its comparative analysis with other Aeromonas species: investigation of the mobilome in environmental and pathogenic strains.

Due to the predominant usage of short-read sequencing to date, most bacterial genome sequences reported in the last years remain at the draft level. This precludes certain types of analyses, such as the in-depth analysis of genome plasticity.Here we report the finalized genome sequence of the environmental strain Aeromonas salmonicida subsp. pectinolytica 34mel, for which only a draft genome with 253 contigs is currently available. Successful completion of the transposon-rich genome critically depended on the PacBio long read sequencing technology. Using finalized genome sequences of A. salmonicida subsp. pectinolytica and other Aeromonads, we report the detailed analysis of the transposon composition of these bacterial species. Mobilome evolution is exemplified by a complex transposon, which has shifted from pathogenicity-related to environmental-related gene content in A. salmonicida subsp. pectinolytica 34mel.Obtaining the complete, circular genome of A. salmonicida subsp. pectinolytica allowed us to perform an in-depth analysis of its mobilome. We demonstrate the mobilome-dependent evolution of this strain’s genetic profile from pathogenic to environmental.


July 19, 2019

Extreme sensitivity to ultraviolet light in the fungal pathogen causing white-nose syndrome of bats.

Bat white-nose syndrome (WNS), caused by the fungal pathogen Pseudogymnoascus destructans, has decimated North American hibernating bats since its emergence in 2006. Here, we utilize comparative genomics to examine the evolutionary history of this pathogen in comparison to six closely related nonpathogenic species. P. destructans displays a large reduction in carbohydrate-utilizing enzymes (CAZymes) and in the predicted secretome (~50%), and an increase in lineage-specific genes. The pathogen has lost a key enzyme, UVE1, in the alternate excision repair (AER) pathway, which is known to contribute to repair of DNA lesions induced by ultraviolet (UV) light. Consistent with a nonfunctional AER pathway, P. destructans is extremely sensitive to UV light, as well as the DNA alkylating agent methyl methanesulfonate (MMS). The differential susceptibility of P. destructans to UV light in comparison to other hibernacula-inhabiting fungi represents a potential “Achilles’ heel” of P. destructans that might be exploited for treatment of bats with WNS.


July 19, 2019

Long-read sequence assembly of the firefly Pyrocoelia pectoralis genome.

Fireflies are a family of insects within the beetle order Coleoptera, or winged beetles, and they are one of the most well-known and loved insect species because of their bioluminescence. However, the firefly is in danger of extinction because of the massive destruction of its living environment. In order to improve the understanding of fireflies and protect them effectively, we sequenced the whole genome of the terrestrial firefly Pyrocoelia pectoralis.Here, we developed a highly reliable genome resource for the terrestrial firefly Pyrocoelia pectoralis (E. Oliv., 1883; Coleoptera: Lampyridae) using single molecule real time (SMRT) sequencing on the PacBio Sequel platform. In total, 57.8 Gb of long reads were generated and assembled into a 760.4-Mb genome, which is close to the estimated genome size and covered 98.7% complete and 0.7% partial insect Benchmarking Universal Single-Copy Orthologs. The k-mer analysis showed that this genome is highly heterozygous. However, our long-read assembly demonstrates continuousness with a contig N50 length of 3.04 Mb and the longest contig length of 13.69 Mb. Furthermore, 135 589 SSRs and 341 Mb of repeat sequences were detected. A total of 23 092 genes were predicted; 88.44% of genes were annotated with one or more related functions.We assembled a high-quality firefly genome, which will not only provide insights into the conservation and biodiversity of fireflies, but also provide a wealth of information to study the mechanisms of their sexual communication, bio-luminescence, and evolution.© The Authors 2017. Published by Oxford University Press.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.