April 21, 2020  |  

A chromosomal-level genome assembly for the insect vector for Chagas disease, Triatoma rubrofasciata.

Triatoma rubrofasciata is a widespread pathogen vector for Chagas disease, an illness that affects approximately 7 million people worldwide. Despite its importance to human health, its evolutionary origin has not been conclusively determined. A reference genome for T. rubrofasciata is not yet available. We have sequenced the genome of a female T. rubrofasciata individual using a single-molecule DNA sequencing technology (i.e., the PacBio Sequel platform) and have successfully reconstructed a whole-genome (680-Mb) assembly that covers 90% of the nuclear genome (757 Mb). Through Hi-C analysis, we have reconstructed full-length chromosomes of this female individual, which has 13 unique chromosomes (2n = 24 = 22 + X1 + X2), with a contig N50 of 2.72 Mb and a scaffold N50 of 50.7 Mb. This genome has achieved a high base-level accuracy of 99.99%. This platinum-grade genome assembly has 12,691 annotated protein-coding genes. More than 95.1% of BUSCO genes were single-copy completed, indicating a high level of completeness of the genome. The platinum-grade genome assembly and its annotation provide valuable information for future in-depth comparative genomics studies, including sexual determination analysis in T. rubrofasciata and the pathogenesis of Chagas disease. © The Author(s) 2019. Published by Oxford University Press.
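
The contig and scaffold N50 values quoted above follow the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A minimal sketch of that calculation is given here for orientation only; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    total assembly size.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy example (hypothetical lengths, arbitrary units): total = 300,
# and 80 + 70 = 150 reaches half of the total, so the N50 is 70.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Because the statistic depends only on the length distribution, a high N50 indicates contiguity, not correctness, which is why the abstract reports base-level accuracy and BUSCO completeness alongside it.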


April 21, 2020  |  

Genomic and Functional Analysis of Emerging Virulent and Multidrug-Resistant Escherichia coli Lineage Sequence Type 648.

The pathogenic extended-spectrum-beta-lactamase (ESBL)-producing Escherichia coli lineage ST648 is increasingly reported from multiple origins. Our study of a large and global ST648 collection from various hosts (87 whole-genome sequences) combining core and accessory genomics with functional analyses and in vivo experiments suggests that ST648 is a nascent and generalist lineage, lacking clear phylogeographic and host association signals. By including large numbers of ST131 (n = 107) and ST10 (n = 96) strains for comparative genomics and phenotypic analysis, we demonstrate that the combination of multidrug resistance and high-level virulence is the hallmark of ST648, similar to international high-risk clonal lineage ST131. Specifically, our in silico, in vitro, and in vivo results demonstrate that ST648 is well equipped with biofilm-associated features, while ST131 shows sophisticated signatures indicative of adaptation to urinary tract infection, potentially conveying individual ecological niche adaptation. In addition, we used a recently developed NFDS (negative frequency-dependent selection) population model suggesting that ST648 will increase significantly in frequency as a cause of bacteremia within the next few years. Also, ESBL plasmids impacting biofilm formation aided in shaping and maintaining ST648 strains to successfully emerge worldwide across different ecologies. Our study contributes to understanding what factors drive the evolution and spread of emerging international high-risk clonal lineages. Copyright © 2019 American Society for Microbiology.


April 21, 2020  |  

The genomes of pecan and Chinese hickory provide insights into Carya evolution and nut nutrition.

Pecan (Carya illinoinensis) and Chinese hickory (C. cathayensis) are important commercially cultivated nut trees in the genus Carya (Juglandaceae), with high nutritional value and substantial health benefits. We obtained >187.22 and 178.87 gigabases of sequence, and ~288× and 248× genome coverage, for a pecan cultivar (“Pawnee”) and a domesticated Chinese hickory landrace (ZAFU-1), respectively. The total assembly size is 651.31 megabases (Mb) for pecan and 706.43 Mb for Chinese hickory. Two genome duplication events before the divergence from walnut were found in these species. Gene family analysis highlighted key genes in biotic and abiotic tolerance, oil, polyphenols, essential amino acids, and B vitamins. Further analyses of reduced-coverage genome sequences of 16 Carya and 2 Juglans species provide additional phylogenetic perspective on crop wild relatives. Cooperative characterization of these valuable resources provides a window to their evolutionary development and a valuable foundation for future crop improvement. © The Author(s) 2019. Published by Oxford University Press.


April 21, 2020  |  

Harnessing long-read amplicon sequencing to uncover NRPS and Type I PKS gene sequence diversity in polar desert soils.

The severity of environmental conditions at Earth’s frigid zones presents attractive opportunities for microbial biomining due to their heightened potential as reservoirs for novel secondary metabolites. Arid soil microbiomes within the Antarctic and Arctic circles are remarkably rich in Actinobacteria and Proteobacteria, bacterial phyla known to be prolific producers of natural products. Yet the diversity of secondary metabolite genes within these cold, extreme environments remains largely unknown. Here, we employed amplicon sequencing using PacBio RS II, a third generation long-read platform, to survey over 200 soils spanning twelve east Antarctic and high Arctic sites for natural product-encoding genes, specifically targeting non-ribosomal peptides (NRPS) and Type I polyketides (PKS). NRPS-encoding genes were more widespread across the Antarctic, whereas PKS genes were only recoverable from a handful of sites. Many recovered sequences were deemed novel due to their low amino acid sequence similarity to known protein sequences, particularly throughout the east Antarctic sites. Phylogenetic analysis revealed that a high proportion were most similar to antifungal and biosurfactant-type clusters. Multivariate analysis showed that soil fertility factors of carbon, nitrogen and moisture displayed significant negative relationships with natural product gene richness. Our combined results suggest that secondary metabolite production is likely to be an important physiological component of survival for microorganisms inhabiting arid, nutrient-starved soils. © FEMS 2019.


April 21, 2020  |  

Detection of VIM-1-Producing Enterobacter cloacae and Salmonella enterica Serovars Infantis and Goldcoast at a Breeding Pig Farm in Germany in 2017 and Their Molecular Relationship to Former VIM-1-Producing S. Infantis Isolates in German Livestock Production.

In 2011, VIM-1-producing Salmonella enterica serovar Infantis and Escherichia coli were isolated for the first time in four German livestock farms. In 2015/2016, highly related isolates were identified in German pig production. This raised the issue of potential reservoirs for these isolates, the relation of their mobile genetic elements, and potential links between the different affected farms/facilities. In a piglet-producing farm suspected of being linked to some blaVIM-1 findings in Germany, fecal and environmental samples were examined for the presence of carbapenemase-producing Enterobacteriaceae and Salmonella spp. Newly discovered isolates were subjected to Illumina whole-genome sequencing (WGS) and S1 pulsed-field gel electrophoresis (PFGE) hybridization experiments. WGS data of these isolates were compared with those for the previously isolated VIM-1-producing Salmonella Infantis isolates from pigs and poultry. Among 103 samples, one Salmonella Goldcoast isolate, one Salmonella Infantis isolate, and one Enterobacter cloacae isolate carrying the blaVIM-1 gene were detected. Comparative WGS analysis revealed that the blaVIM-1 gene was part of a particular Tn21-like transposable element in all isolates. It was located on IncHI2 (ST1) plasmids of ~290 to 300 kb with a backbone highly similar (98 to 100%) to that of reference pSE15-SA01028. SNP analysis revealed a close relationship of all VIM-1-positive S. Infantis isolates described since 2011.
The findings of this study demonstrate that the occurrence of the blaVIM-1 gene in German livestock is restricted neither to a certain bacterial species nor to a certain Salmonella serovar but is linked to a particular Tn21-like transposable element located on transferable pSE15-SA01028-like IncHI2 (ST1) plasmids, being present in all of the investigated isolates from 2011 to 2017. IMPORTANCE Carbapenems are considered one of few remaining treatment options against multidrug-resistant Gram-negative pathogens in human clinical settings. The occurrence of carbapenemase-producing Enterobacteriaceae in livestock and food is a major public health concern. Particularly the occurrence of VIM-1-producing Salmonella Infantis in livestock farms is worrisome, as this zoonotic pathogen is one of the main causes for human salmonellosis in Europe. Investigations on the epidemiology of those carbapenemase-producing isolates and associated mobile genetic elements through an in-depth molecular characterization are indispensable to understand the transmission of carbapenemase-producing Enterobacteriaceae along the food chain and between different populations to develop strategies to prevent their further spread. Copyright © 2019 Roschanski et al.


April 21, 2020  |  

A critical comparison of technologies for a plant genome sequencing project.

A high-quality genome sequence of any model organism is an essential starting point for genetic and other studies. Older clone-based methods are slow and expensive, whereas faster, cheaper short-read-only assemblies can be incomplete and highly fragmented, which minimizes their usefulness. The last few years have seen the introduction of many new technologies for genome assembly. These new technologies and associated new algorithms are typically benchmarked on microbial genomes or, if they scale appropriately, on larger (e.g., human) genomes. However, plant genomes can be much more repetitive and larger than the human genome, and plant biochemistry often makes obtaining high-quality DNA that is free from contaminants difficult. Reflecting their challenging nature, we observe that plant genome assembly statistics are typically poorer than for vertebrates. Here, we compare Illumina short read, Pacific Biosciences long read, 10x Genomics linked reads, Dovetail Hi-C, and BioNano Genomics optical maps, singly and combined, in producing high-quality long-range genome assemblies of the potato species Solanum verrucosum. We benchmark the assemblies for completeness and accuracy, as well as DNA and compute requirements and sequencing costs. The field of genome sequencing and assembly is reaching maturity, and the differences we observe between assemblies are surprisingly small. We expect that our results will be helpful to other genome projects, and that these datasets will be used in benchmarking by assembly algorithm developers. © The Author(s) 2019. Published by Oxford University Press.


April 21, 2020  |  

The Modern View of B Chromosomes Under the Impact of High Scale Omics Analyses.

Supernumerary B chromosomes (Bs) are extra karyotype units in addition to A chromosomes, and are found in some fungi and thousands of animal and plant species. Bs are uniquely characterized by their non-Mendelian inheritance, and represent one of the best examples of genomic conflict. Over the last decades, their genetic composition, function and evolution have remained an unresolved query, although a few successful attempts have been made to address these phenomena. A classical concept based on cytogenetics and genetics is that Bs are selfish and abundant with DNA repeats and transposons, and in most cases, they do not carry any function. However, recently, the modern quantum development of high scale multi-omics techniques has shifted B research towards a new-born field that we call “B-omics”. We review the recent literature and add novel perspectives to the B research, discussing the role of new technologies to understand the mechanistic perspectives of the molecular evolution and function of Bs. The modern view states that B chromosomes are enriched with genes for many significant biological functions, including but not limited to the interesting set of genes related to cell cycle and chromosome structure. Furthermore, the presence of B chromosomes could favor genomic rearrangements and influence the nuclear environment affecting the function of other chromatin regions. We hypothesize that B chromosomes might play a key function in driving their transmission and maintenance inside the cell, as well as offer an extra genomic compartment for evolution.


April 21, 2020  |  

Divergent evolution in the genomes of closely related lacertids, Lacerta viridis and L. bilineata, and implications for speciation.

Lacerta viridis and Lacerta bilineata are sister species of European green lizards (eastern and western clades, respectively) that, until recently, were grouped together as the L. viridis complex. Genetic incompatibilities were observed between lacertid populations through crossing experiments, which led to the delineation of two separate species within the L. viridis complex. The population history of these sister species and processes driving divergence are unknown. We constructed the first high-quality de novo genome assemblies for both L. viridis and L. bilineata through Illumina and PacBio sequencing, with annotation support provided from transcriptome sequencing of several tissues. To estimate gene flow between the two species and identify factors involved in reproductive isolation, we studied their evolutionary history, identified genomic rearrangements, and detected signatures of selection on non-coding RNA and on protein-coding genes. Here we show that gene flow was primarily unidirectional from L. bilineata to L. viridis after their split at least 1.15 million years ago. We detected positive selection of the non-coding repertoire; mutations in transcription factors; accumulation of divergence through inversions; selection on genes involved in neural development, reproduction, and behavior, as well as in ultraviolet-response, possibly driven by sexual selection, whose contribution to reproductive isolation between these lacertid species needs to be further evaluated. The combination of short and long sequence reads resulted in one of the most complete lizard genome assemblies. The characterization of a diverse array of genomic features provided valuable insights into the demographic history of divergence among European green lizards, as well as key species differences, some of which are candidates that could have played a role in speciation.
In addition, our study generated valuable genomic resources that can be used to address conservation-related issues in lacertids. © The Author(s) 2018. Published by Oxford University Press.


April 21, 2020  |  

Into the Thermus Mobilome: Presence, Diversity and Recent Activities of Insertion Sequences Across Thermus spp.

A high level of transposon-mediated genome rearrangement is a common trait among microorganisms isolated from thermal environments, probably contributing to the extraordinary genomic plasticity and horizontal gene transfer (HGT) observed in these habitats. In this work, active and inactive insertion sequences (ISs) spanning the sequenced members of the genus Thermus were characterized, with special emphasis on three T. thermophilus strains: HB27, HB8, and NAR1. A large number of full ISs and fragments derived from different IS families were found, concentrating within megaplasmids present in most isolates. Potentially active ISs were identified through analysis of transposase integrity, and domestication-related transposition events of ISTth7 were identified in laboratory-adapted HB27 derivatives. Many partial copies of ISs appeared throughout the genome, which may serve as specific targets for homologous recombination contributing to genome rearrangement. Moreover, recruitment of IS1000 32 bp segments as spacers for CRISPR sequence was identified, pointing to the adaptability of these elements in the biology of these thermophiles. Further knowledge about the activity and functional diversity of ISs in this genus may contribute to the generation of engineered transposons as new genetic tools, and enrich our understanding of the outstanding plasticity shown by these thermophiles.


April 21, 2020  |  

Whole-genome sequence of the oriental lung fluke Paragonimus westermani.

Foodborne infections caused by lung flukes of the genus Paragonimus are a significant and widespread public health problem in tropical areas. Approximately 50 Paragonimus species have been reported to infect animals and humans, but Paragonimus westermani is responsible for the bulk of human disease. Despite their medical and economic importance, no genome sequence for any Paragonimus species is available. We sequenced and assembled the genome of P. westermani, which is among the largest of the known pathogen genomes with an estimated size of 1.1 Gb. A 922.8 Mb genome assembly was generated from Illumina and Pacific Biosciences (PacBio) sequence data, covering 84% of the estimated genome size. The genome has a high proportion (45%) of repeat-derived DNA, particularly of the long interspersed element and long terminal repeat subtypes, and the expansion of these elements may explain some of the large size. We predicted 12,852 protein coding genes, showing a high level of conservation with related trematode species. The majority of proteins (80%) had homologs in the human liver fluke Opisthorchis viverrini, with an average sequence identity of 64.1%. Assembly of the P. westermani mitochondrial genome from long PacBio reads resulted in a single high-quality circularized 20.6 kb contig. The contig harbored a 6.9 kb region of non-coding repetitive DNA comprised of three distinct repeat units. Our results suggest that the region is highly polymorphic in P. westermani, possibly even within single worm isolates. The generated assembly represents the first Paragonimus genome sequence and will facilitate future molecular studies of this important, but neglected, parasite group.


April 21, 2020  |  

The genome of Peromyscus leucopus, natural host for Lyme disease and other emerging infections.

The rodent Peromyscus leucopus is the natural reservoir of several tick-borne infections, including Lyme disease. To expand the knowledge base for this key species in life cycles of several pathogens, we assembled and scaffolded the P. leucopus genome. The resulting assembly was 2.45 Gb in total length, with 24 chromosome-length scaffolds harboring 97% of predicted genes. RNA sequencing following infection of P. leucopus with Borreliella burgdorferi, a Lyme disease agent, shows that, unlike blood, the skin is actively responding to the infection after several weeks. P. leucopus has a high level of segregating nucleotide variation, suggesting that natural resistance alleles to CRISPR gene targeting constructs are likely segregating in wild populations. The reference genome will allow for experiments aimed at elucidating the mechanisms by which this widely distributed rodent serves as a natural reservoir for several infectious diseases of public health importance, potentially enabling intervention strategies.


April 21, 2020  |  

Hidden genomic evolution in a morphospecies: The landscape of rapidly evolving genes in Tetrahymena.

A morphospecies is defined as a taxonomic species based wholly on morphology, but often morphospecies consist of clusters of cryptic species that can be identified genetically or molecularly. The nature of the evolutionary novelty that accompanies speciation in a morphospecies is an intriguing question. Morphospecies are particularly common among ciliates, a group of unicellular eukaryotes that separates 2 kinds of nuclei, the silenced germline nucleus (micronucleus [MIC]) and the actively expressed somatic nucleus (macronucleus [MAC]), within a common cytoplasm. Because of their very similar morphologies, members of the Tetrahymena genus are considered a morphospecies. We explored the hidden genomic evolution within this genus by performing a comprehensive comparative analysis of the somatic genomes of 10 species and the germline genomes of 2 species of Tetrahymena. These species show high genetic divergence; phylogenomic analysis suggests that the genus originated about 300 million years ago (Mya). Seven universal protein domains are preferentially included among the species-specific (i.e., the youngest) Tetrahymena genes. In particular, leucine-rich repeat (LRR) genes make the largest contribution to the high level of genome divergence of the 10 species. LRR genes can be sorted into 3 different age groups. Parallel evolutionary trajectories have independently occurred among LRR genes in the different Tetrahymena species. Thousands of young LRR genes contain tandem arrays of exactly 90-bp exons. The introns separating these exons show a unique, extreme phase 2 bias, suggesting a clonal origin and successive expansions of 90-bp-exon LRR genes. Identifying LRR gene age groups allowed us to document a Tetrahymena intron length cycle. The youngest 90-bp exon LRR genes in T. thermophila are concentrated in pericentromeric and subtelomeric regions of the 5 micronuclear chromosomes, suggesting that these regions act as genome innovation centers.
Copies of a Tetrahymena long interspersed element (LINE)-like retrotransposon are very frequently found physically adjacent to 90-bp exon/intron repeat units of the youngest LRR genes. We propose that Tetrahymena species have used a massive exon-shuffling mechanism, involving unequal crossing over possibly in concert with retrotransposition, to create the unique 90-bp exon array LRR genes.


April 21, 2020  |  

Reconstruction of the genomes of drug-resistant pathogens for outbreak investigation through metagenomic sequencing

Culture-independent methods that target genome fragments have shown promise in identifying certain pathogens, but the holy grail of comprehensive pathogen genome detection from microbiologically complex samples for subsequent forensic analyses remains a challenge. In the context of an investigation of a nosocomial outbreak, we used shotgun metagenomic sequencing of a human fecal sample and a neural network algorithm based on tetranucleotide frequency profiling to reconstruct microbial genomes and tested the same approach using rectal swabs from a second patient. The approach rapidly and readily detected the genome of Klebsiella pneumoniae carbapenemase (KPC)-producing K. pneumoniae in the patient fecal specimen and in the rectal swab sample, achieving a level of strain resolution that was sufficient for confident transmission inference during a highly clonal outbreak. The analysis also detected previously unrecognized colonization of the patient by vancomycin-resistant Enterococcus faecium, another multidrug-resistant bacterium. IMPORTANCE The study results reported here perfectly demonstrate the power and promise of clinical metagenomics to recover genome sequences of important drug-resistant bacteria and to rapidly provide rich data that inform outbreak investigations and treatment decisions, independently of the need to culture the organisms.
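
The abstract above mentions tetranucleotide frequency profiling as the input to its genome reconstruction step. As a rough illustration of what such a profile is, the sketch below counts overlapping 4-mers in a sequence and normalizes them to frequencies. It is not the authors' neural network algorithm; the function name `tetranucleotide_profile`, the choice to skip windows containing ambiguous bases, and the fixed alphabetical ordering of the 256 k-mers are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

# All 256 tetranucleotides in a fixed (alphabetical) order, so that every
# sequence maps to a vector with the same layout.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_profile(seq):
    """Return a 256-element list of tetranucleotide frequencies.

    Overlapping windows of length 4 are counted; windows containing any
    character other than A, C, G, or T (for example N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Counter returns 0 for k-mers that never occurred.
    valid = [counts[k] for k in KMERS]
    total = sum(valid)
    if total == 0:
        return [0.0] * len(KMERS)
    return [c / total for c in valid]


# Toy example: "ACGTACGT" has 5 windows (ACGT, CGTA, GTAC, TACG, ACGT),
# so ACGT has frequency 2/5.
profile = tetranucleotide_profile("ACGTACGT")
print(profile[KMERS.index("ACGT")])
```

Profiles like this tend to be similar for fragments of the same genome and different between genomes, which is what makes them usable for grouping metagenomic contigs. Practical binning tools often also merge each k-mer with its reverse complement, a step omitted here for brevity.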


April 21, 2020  |  

A High-Quality Grapevine Downy Mildew Genome Assembly Reveals Rapidly Evolving and Lineage-Specific Putative Host Adaptation Genes.

Downy mildews are obligate biotrophic oomycete pathogens that cause devastating plant diseases on economically important crops. Plasmopara viticola is the causal agent of grapevine downy mildew, a major disease in vineyards worldwide. We sequenced the genome of Pl. viticola with PacBio long reads and obtained a new 92.94 Mb assembly with high contiguity (359 scaffolds for an N50 of 706.5 kb) due to a better resolution of repeat regions. This assembly presented a high level of gene completeness, recovering 1,592 genes encoding secreted proteins involved in plant-pathogen interactions. Plasmopara viticola had a two-speed genome architecture, with secreted protein-encoding genes preferentially located in gene-sparse, repeat-rich regions and evolving rapidly, as indicated by pairwise dN/dS values. We also used short reads to assemble the genome of Plasmopara muralis, a closely related species infecting grape ivy (Parthenocissus tricuspidata). The lineage-specific proteins identified by comparative genomics analysis included a large proportion of RxLR cytoplasmic effectors and, more generally, genes with high dN/dS values. We identified 270 candidate genes under positive selection, including several genes encoding transporters and components of the RNA machinery potentially involved in host specialization. Finally, the Pl. viticola genome assembly generated here will allow the development of robust population genomics approaches for investigating the mechanisms involved in adaptation to biotic and abiotic selective pressures in this species. © The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


April 21, 2020  |  

Comprehensive evaluation of non-hybrid genome assembly tools for third-generation PacBio long-read sequence data.

Long reads obtained from third-generation sequencing platforms can help overcome the long-standing challenge of the de novo assembly of sequences for the genomic analysis of non-model eukaryotic organisms. Numerous long-read-aided de novo assemblies have been published recently, which exhibited superior quality of the assembled genomes in comparison with those achieved using earlier second-generation sequencing technologies. Evaluating assemblies is important in guiding the appropriate choice for specific research needs. In this study, we evaluated 10 long-read assemblers using a variety of metrics on Pacific Biosciences (PacBio) data sets from different taxonomic categories with considerable differences in genome size. The results allowed us to narrow down the list to a few assemblers that can be effectively applied to eukaryotic assembly projects. Moreover, we highlight how best to use limited genomic resources for effectively evaluating the genome assemblies of non-model organisms. © The Author 2017. Published by Oxford University Press.

