April 21, 2020  |  

Genome assembly and annotation of the Trichoplusia ni Tni-FNL insect cell line enabled by long-read technologies.

Trichoplusia ni-derived cell lines are commonly used to enable recombinant protein expression via baculovirus infection to generate materials approved for clinical use and in clinical trials. In order to develop systems biology and genome engineering tools to improve protein expression in this host, we performed de novo genome assembly of the Trichoplusia ni-derived cell line Tni-FNL. By integration of PacBio single-molecule sequencing, Bionano optical mapping, and 10X Genomics linked-reads data, we have produced a draft genome assembly of Tni-FNL. Our assembly contains 280 scaffolds, with an N50 scaffold size of 2.3 Mb and a total length of 359 Mb. Annotation of the Tni-FNL genome resulted in 14,101 predicted genes and 93.2% of the predicted proteome contained recognizable protein domains. Ortholog searches within the superorder Holometabola provided further evidence of high accuracy and completeness of the Tni-FNL genome assembly. This first draft Tni-FNL genome assembly was enabled by complementary long-read technologies and represents a high-quality, well-annotated genome that provides novel insight into the complexity of this insect cell line and can serve as a reference for future large-scale genome engineering work in this and other similar recombinant protein production hosts.
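For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: sort the scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from the Tni-FNL assembly or from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Hypothetical toy scaffold lengths (arbitrary units), not real data.
toy_lengths = [8, 5, 4, 3, 2, 2]
print(n50(toy_lengths))
```

With real data, the lengths would typically be read from the assembly FASTA or its index, and the result reported in base pairs.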


April 21, 2020  |  

Into the Thermus Mobilome: Presence, Diversity and Recent Activities of Insertion Sequences Across Thermus spp.

A high level of transposon-mediated genome rearrangement is a common trait among microorganisms isolated from thermal environments, probably contributing to the extraordinary genomic plasticity and horizontal gene transfer (HGT) observed in these habitats. In this work, active and inactive insertion sequences (ISs) spanning the sequenced members of the genus Thermus were characterized, with special emphasis on three T. thermophilus strains: HB27, HB8, and NAR1. A large number of full ISs and fragments derived from different IS families were found, concentrating within megaplasmids present in most isolates. Potentially active ISs were identified through analysis of transposase integrity, and domestication-related transposition events of ISTth7 were identified in laboratory-adapted HB27 derivatives. Many partial copies of ISs appeared throughout the genome, which may serve as specific targets for homologous recombination contributing to genome rearrangement. Moreover, recruitment of IS1000 32 bp segments as spacers for CRISPR sequence was identified, pointing to the adaptability of these elements in the biology of these thermophiles. Further knowledge about the activity and functional diversity of ISs in this genus may contribute to the generation of engineered transposons as new genetic tools, and enrich our understanding of the outstanding plasticity shown by these thermophiles.


April 21, 2020  |  

Whole-genome sequence of the oriental lung fluke Paragonimus westermani.

Foodborne infections caused by lung flukes of the genus Paragonimus are a significant and widespread public health problem in tropical areas. Approximately 50 Paragonimus species have been reported to infect animals and humans, but Paragonimus westermani is responsible for the bulk of human disease. Despite their medical and economic importance, no genome sequence for any Paragonimus species is available. We sequenced and assembled the genome of P. westermani, which is among the largest of the known pathogen genomes with an estimated size of 1.1 Gb. A 922.8 Mb genome assembly was generated from Illumina and Pacific Biosciences (PacBio) sequence data, covering 84% of the estimated genome size. The genome has a high proportion (45%) of repeat-derived DNA, particularly of the long interspersed element and long terminal repeat subtypes, and the expansion of these elements may explain some of the large size. We predicted 12,852 protein coding genes, showing a high level of conservation with related trematode species. The majority of proteins (80%) had homologs in the human liver fluke Opisthorchis viverrini, with an average sequence identity of 64.1%. Assembly of the P. westermani mitochondrial genome from long PacBio reads resulted in a single high-quality circularized 20.6 kb contig. The contig harbored a 6.9 kb region of non-coding repetitive DNA comprised of three distinct repeat units. Our results suggest that the region is highly polymorphic in P. westermani, possibly even within single worm isolates. The generated assembly represents the first Paragonimus genome sequence and will facilitate future molecular studies of this important, but neglected, parasite group.


April 21, 2020  |  

Adaptive archaic introgression of copy number variants and the discovery of previously unknown human genes

As they migrated out of Africa and into Europe and Asia, anatomically modern humans interbred with archaic hominins, such as Neanderthals and Denisovans. The result of this genetic introgression on the recipient populations has been of considerable interest, especially in cases of selection for specific archaic genetic variants. Hsieh et al. characterized adaptive structural variants and copy number variants that are likely targets of positive selection in Melanesians. Focusing on population-specific regions of the genome that carry duplicated genes and show an excess of amino acid replacements provides evidence for one of the mechanisms by which genetic novelty can arise and result in differentiation between human genomes. Science, this issue p. eaax2083.

INTRODUCTION: Characterizing genetic variants underlying local adaptations in human populations is one of the central goals of evolutionary research. Most studies have focused on adaptive single-nucleotide variants that either arose as new beneficial mutations or were introduced after interbreeding with our now-extinct relatives, including Neanderthals and Denisovans. The adaptive role of copy number variants (CNVs), another well-known form of genomic variation generated through deletions or duplications that affect more base pairs in the genome, is less well understood, despite evidence that such mutations are subject to stronger selective pressures.

RATIONALE: This study focuses on the discovery of introgressed and adaptive CNVs that have become enriched in specific human populations. We combine whole-genome CNV calling and population genetic inference methods to discover CNVs and then assess signals of selection after controlling for demographic history. We examine 266 publicly available modern human genomes from the Simons Genome Diversity Project and genomes of three ancient hominins: a Denisovan, a Neanderthal from the Altai Mountains in Siberia, and a Neanderthal from Croatia.
We apply long-read sequencing methods to sequence-resolve complex CNVs of interest specifically in the Melanesians, an Oceanian population distributed from Papua New Guinea to as far east as the islands of Fiji and known to harbor some of the greatest amounts of Neanderthal and Denisovan ancestry.

RESULTS: Consistent with the hypothesis of archaic introgression outside Africa, we find a significant excess of CNV sharing between modern non-African populations and archaic hominins (P = 0.039). Among Melanesians, we observe an enrichment of CNVs with potential signals of positive selection (n = 37 CNVs), of which 19 CNVs likely introgressed from archaic hominins. We show that Melanesian-stratified CNVs are significantly associated with signals of positive selection (P = 0.0323). Many map near or within genes associated with metabolism (e.g., ACOT1 and ACOT2), development and cell cycle or signaling (e.g., TNFRSF10D and CDK11A and CDK11B), or immune response (e.g., IFNLR1). We characterize two of the largest and most complex CNVs on chromosomes 16p11.2 and 8p21.3 that introgressed from Denisovans and Neanderthals, respectively, and are absent from most other human populations. At chromosome 16p11.2, we sequence-resolve a large duplication of >383 thousand base pairs (kbp) that originated from Denisovans and introgressed into the ancestral Melanesian population 60,000 to 170,000 years ago. This large duplication occurs at high frequency (>79%) in diverse Melanesian groups, shows signatures of positive selection, and maps adjacent to Homo sapiens-specific duplications that predispose to rearrangements associated with autism. On chromosome 8p21.3, we identify a Melanesian haplotype that carries two CNVs, a ~6-kbp deletion and a ~38-kbp duplication, with a Neanderthal origin and that introgressed into non-Africans 40,000 to 120,000 years ago.
This CNV haplotype occurs at high frequency (44%) and shows signals consistent with a partial selective sweep in Melanesians. Using long-read sequencing genomic and transcriptomic data, we reconstruct the structure and complex evolutionary history for these two CNVs and discover previously undescribed duplicated genes (TNFRSF10D1, TNFRSF10D2, and NPIPB16) that show an excess of amino acid replacements consistent with the action of positive selection.

CONCLUSION: Our results suggest that large CNVs originating in archaic hominins and introgressed into modern humans have played an important role in local population adaptation and represent an insufficiently studied source of large-scale genetic variation that is absent from current reference genomes.

Large adaptive-introgressed CNVs at chromosomes 8p21.3 and 16p11.2 in Melanesians. The magnifying glasses highlight structural differences between the archaic (top) and reference (bottom) genomes. Neanderthal (red) and Denisovan (blue) haplotypes encompassing large CNVs occur at high frequencies in Melanesians (44 and 79%, respectively) but are absent (black) in all non-Melanesians. These CNVs create positively selected genes (TNFRSF10D1, TNFRSF10D2, and NPIPB16) that are absent from the reference genome.

Copy number variants (CNVs) are subject to stronger selective pressure than single-nucleotide variants, but their roles in archaic introgression and adaptation have not been systematically investigated. We show that stratified CNVs are significantly associated with signatures of positive selection in Melanesians and provide evidence for adaptive introgression of large CNVs at chromosomes 16p11.2 and 8p21.3 from Denisovans and Neanderthals, respectively. Using long-read sequence data, we reconstruct the structure and complex evolutionary history of these polymorphisms and show that both encode positively selected genes absent from most human populations.
Our results collectively suggest that large CNVs originating in archaic hominins and introgressed into modern humans have played an important role in local population adaptation and represent an insufficiently studied source of large-scale genetic variation.


April 21, 2020  |  

The comparative genomics and complex population history of Papio baboons.

Recent studies suggest that closely related species can accumulate substantial genetic and phenotypic differences despite ongoing gene flow, thus challenging traditional ideas regarding the genetics of speciation. Baboons (genus Papio) are Old World monkeys consisting of six readily distinguishable species. Baboon species hybridize in the wild, and prior data imply a complex history of differentiation and introgression. We produced a reference genome assembly for the olive baboon (Papio anubis) and whole-genome sequence data for all six extant species. We document multiple episodes of admixture and introgression during the radiation of Papio baboons, thus demonstrating their value as a model of complex evolutionary divergence, hybridization, and reticulation. These results help inform our understanding of similar cases, including modern humans, Neanderthals, Denisovans, and other ancient hominins.


April 21, 2020  |  

Long-read amplicon denoising.

Long-read next-generation amplicon sequencing shows promise for studying complete genes or genomes from complex and diverse populations. Current long-read sequencing technologies have challenging error profiles, hindering data processing and incorporation into downstream analyses. Here we consider the problem of how to reconstruct, free of sequencing error, the true sequence variants and their associated frequencies from PacBio reads. Called ‘amplicon denoising’, this problem has been extensively studied for short-read sequencing technologies, but current solutions do not always successfully generalize to long reads with high indel error rates. We introduce two methods: one that runs nearly instantly and is very accurate for medium length reads and high template coverage, and another, slower method that is more robust when reads are very long or coverage is lower. On two Mock Virus Community datasets with ground truth, each sequenced on a different PacBio instrument, and on a number of simulated datasets, we compare our two approaches to each other and to existing algorithms. We outperform all tested methods in accuracy, with competitive run times even for our slower method, successfully discriminating templates that differ by just a single nucleotide. Julia implementations of Fast Amplicon Denoising (FAD) and Robust Amplicon Denoising (RAD), and a webserver interface, are freely available. © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.


April 21, 2020  |  

Chromulinavorax destructans, a pathogen of microzooplankton that provides a window into the enigmatic candidate phylum Dependentiae.

Members of the major candidate phylum Dependentiae (a.k.a. TM6) are widespread across diverse environments from showerheads to peat bogs; yet, with the exception of two isolates infecting amoebae, they are only known from metagenomic data. The limited knowledge of their biology indicates that they have a long evolutionary history of parasitism. Here, we present Chromulinavorax destructans (Strain SeV1), the first isolate of this phylum to infect a representative from a widespread and ecologically significant group of heterotrophic flagellates, the microzooplankter Spumella elongata (Strain CCAP 955/1). Chromulinavorax destructans has a reduced 1.2 Mb genome that is so specialized for infection that it shows no evidence of complete metabolic pathways, but encodes an extensive transporter system for importing nutrients and energy in the form of ATP from the host. Its replication causes extensive reorganization and expansion of the mitochondrion, effectively surrounding the pathogen, consistent with its dependency on the host for energy. Nearly half (44%) of the inferred proteins contain signal sequences for secretion, including many without recognizable similarity to proteins of known function, as well as 98 copies of proteins with an ankyrin-repeat domain; ankyrin repeats are known effectors of host modulation, suggesting the presence of an extensive host-manipulation apparatus. These observations help to cement members of this phylum as widespread and diverse parasites infecting a broad range of eukaryotic microbes.


April 21, 2020  |  

Genome-Wide Screening for Enteric Colonization Factors in Carbapenem-Resistant ST258 Klebsiella pneumoniae.

A diverse, antibiotic-naive microbiota prevents highly antibiotic-resistant microbes, including carbapenem-resistant Klebsiella pneumoniae (CR-Kp), from achieving dense colonization of the intestinal lumen. Antibiotic-mediated destruction of the microbiota leads to expansion of CR-Kp in the gut, markedly increasing the risk of bacteremia in vulnerable patients. While preventing dense colonization represents a rational approach to reduce intra- and interpatient dissemination of CR-Kp, little is known about pathogen-associated factors that enable dense growth and persistence in the intestinal lumen. To identify genetic factors essential for dense colonization of the gut by CR-Kp, we constructed a highly saturated transposon mutant library with >150,000 unique mutations in an ST258 strain of CR-Kp and screened for in vitro growth and in vivo intestinal colonization in antibiotic-treated mice. Stochastic and partially reversible fluctuations in the representation of different mutations during dense colonization revealed the dynamic nature of intestinal microbial populations. We identified genes that are crucial for early and late stages of dense gut colonization and confirmed their role by testing isogenic mutants in in vivo competition assays with wild-type CR-Kp. Screening of the transposon library also identified mutations that enhanced in vivo CR-Kp growth. These newly identified colonization factors may provide novel therapeutic opportunities to reduce intestinal colonization by CR-Kp.

IMPORTANCE: Klebsiella pneumoniae is a common cause of bloodstream infections in immunocompromised and hospitalized patients, and over the last 2 decades, some strains have acquired resistance to nearly all available antibiotics, including broad-spectrum carbapenems. The U.S. Centers for Disease Control and Prevention has listed carbapenem-resistant K. pneumoniae (CR-Kp) as an urgent public health threat.
Dense colonization of the intestine by CR-Kp and other antibiotic-resistant bacteria is associated with an increased risk of bacteremia. Reducing the density of gut colonization by CR-Kp is likely to reduce their transmission from patient to patient in health care facilities as well as systemic infections. How CR-Kp expands and persists in the gut lumen, however, is poorly understood. Herein, we generated a highly saturated mutant library in a multidrug-resistant K. pneumoniae strain and identified genetic factors that are associated with dense gut colonization by K. pneumoniae. This study sheds light on host colonization by K. pneumoniae and identifies potential colonization factors that contribute to high-density persistence of K. pneumoniae in the intestine. Copyright © 2019 Jung et al.


April 21, 2020  |  

A chromosome-level sequence assembly reveals the structure of the Arabidopsis thaliana Nd-1 genome and its gene set.

In addition to the BAC-based reference sequence of the accession Columbia-0 from the year 2000, several short-read assemblies of the plant model organism Arabidopsis thaliana have been published in recent years. Also, a SMRT-based assembly of Landsberg erecta has been generated that identified translocation and inversion polymorphisms between two genotypes of the species. Here we provide a chromosome-arm level assembly of the A. thaliana accession Niederzenz-1 (AthNd-1_v2c) based on SMRT sequencing data. The best assembly comprises 69 nucleome sequences and displays a contig length of up to 16 Mbp. Compared to an earlier Illumina short read-based NGS assembly (AthNd-1_v1), a 75-fold increase in contiguity was observed for AthNd-1_v2c. To assign contig locations independent from the Col-0 gold standard reference sequence, we used genetic anchoring to generate a de novo assembly. In addition, we assembled the chondrome and plastome sequences. Detailed analyses of AthNd-1_v2c allowed reliable identification of large genomic rearrangements between A. thaliana accessions contributing to differences in the gene sets that distinguish the genotypes. One of the differences detected identified a gene that is lacking from the Col-0 gold standard sequence. This de novo assembly extends the known proportion of the A. thaliana pan-genome.


April 21, 2020  |  

Consensus and variations in cell line specificity among human metapneumovirus strains.

Human metapneumovirus (HMPV) has been a notable etiological agent of acute respiratory infection in humans, but it was not discovered until 2001, because HMPV replicates only in a limited number of cell lines and the cytopathic effect (CPE) is often mild. To promote the study of HMPV, several groups have generated green fluorescent protein (GFP)-expressing recombinant HMPV strains (HMPVGFP). However, the growing evidence has complicated the understanding of cell line specificity of HMPV, because it seems to vary notably among HMPV strains. In addition, unique A2b clade HMPV strains with a 180-nucleotide duplication in the G gene (HMPV A2b180nt-dup strains) have recently been detected. In this study, we re-evaluated and compared the cell line specificity of clinical isolates of HMPV strains, including the novel HMPV A2b180nt-dup strains, and six recombinant HMPVGFP strains, including the newly generated recombinant HMPV A2b180nt-dup strain, MG0256-EGFP. Our data demonstrate that VeroE6 and LLC-MK2 cells generally showed the highest infectivity with any clinical isolates and recombinant HMPVGFP strains. Other human-derived cell lines (BEAS-2B, A549, HEK293, MNT-1, and HeLa cells) showed certain levels of infectivity with HMPV, but these were significantly lower than those of VeroE6 and LLC-MK2 cells. Also, the infectivity in these suboptimal cell lines varied greatly among HMPV strains. The variations were not directly related to HMPV genotypes, cell lines used for isolation and propagation, specific genome mutations, or nucleotide duplications in the G gene. Thus, these variations in suboptimal cell lines are likely intrinsic to particular HMPV strains.


April 21, 2020  |  

A whole genome scan of SNP data suggests a lack of abundant hard selective sweeps in the genome of the broad host range plant pathogenic fungus Sclerotinia sclerotiorum.

The pathogenic fungus Sclerotinia sclerotiorum infects over 600 species of plant. It is present in numerous environments throughout the world and causes significant damage to many agricultural crops. Fragmentation and lack of gene flow between populations may lead to population sub-structure. Within discrete recombining populations, positive selection may lead to a ‘selective sweep’. This is characterised by an increase in frequency of a favourable allele leading to reduction in genotypic diversity in a localised genomic region due to the phenomenon of genetic hitchhiking. We aimed to assess whether isolates of S. sclerotiorum from around the world formed genotypic clusters associated with geographical origin and to determine whether signatures of population-specific positive selection could be detected. To do this, we sequenced the genomes of 25 isolates of S. sclerotiorum collected from four different continents: Australia, Africa (north and south), Europe and North America (Canada and the northern United States), and conducted SNP-based analyses of population structure and selective sweeps. Among the 25 isolates, there was evidence for two major population clusters. One of these consisted of 11 isolates from Canada, the USA and France (population 1), and the other consisted of nine isolates from Australia and one from Morocco (population 2). The rest of the isolates were genotypic outliers. We found that there was evidence of outcrossing in these two populations based on linkage disequilibrium decay. However, only a single candidate selective sweep was observed, and it was present in population 2. This sweep was close to a Major Facilitator Superfamily transporter gene, and we speculate that this gene may have a role in nutrient uptake from the host. The low abundance of selective sweeps in the S. sclerotiorum genome contrasts with the numerous examples in the genomes of other fungal pathogens.
This may be a result of its slow rate of evolution and low effective recombination rate due to self-fertilisation and vegetative reproduction.


April 21, 2020  |  

Genome Sequence of Rhizobium jaguaris CCGE525T, a Strain Isolated from Calliandra grandiflora Nodules from a Rain Forest in Mexico.

We present the genome sequence of Rhizobium jaguaris CCGE525T, a nitrogen-fixing bacterium isolated from nodules of Calliandra grandiflora. CCGE525T belongs to Rhizobium tropici group A, represents the symbiovar calliandrae, and forms nitrogen-fixing nodules in Phaseolus vulgaris. Genome-based metrics and phylogenomic approaches support Rhizobium jaguaris as a novel species.


April 21, 2020  |  

Genome mining identifies cepacin as a plant-protective metabolite of the biopesticidal bacterium Burkholderia ambifaria.

Beneficial microorganisms are widely used in agriculture for control of plant pathogens, but a lack of efficacy and safety information has limited the exploitation of multiple promising biopesticides. We applied phylogeny-led genome mining, metabolite analyses and biological control assays to define the efficacy of Burkholderia ambifaria, a naturally beneficial bacterium with proven biocontrol properties but potential pathogenic risk. A panel of 64 B. ambifaria strains demonstrated significant antimicrobial activity against priority plant pathogens. Genome sequencing, specialized metabolite biosynthetic gene cluster mining and metabolite analysis revealed an armoury of known and unknown pathways within B. ambifaria. The biosynthetic gene cluster responsible for the production of the metabolite cepacin was identified and directly shown to mediate protection of germinating crops against Pythium damping-off disease. B. ambifaria maintained biopesticidal protection and overall fitness in the soil after deletion of its third replicon, a non-essential plasmid associated with virulence in Burkholderia cepacia complex bacteria. Removal of the third replicon reduced B. ambifaria persistence in a murine respiratory infection model. Here, we show that by using interdisciplinary phylogenomic, metabolomic and functional approaches, the mode of action of natural biological control agents related to pathogens can be systematically established to facilitate their future exploitation.


April 21, 2020  |  

Parallels between natural selection in the cold-adapted crop-wild relative Tripsacum dactyloides and artificial selection in temperate adapted maize.

Artificial selection has produced varieties of domesticated maize that thrive in temperate climates around the world. However, the direct progenitor of maize, teosinte, is indigenous only to a relatively small range of tropical and subtropical latitudes and grows poorly or not at all outside of this region. Tripsacum, a sister genus to maize and teosinte, is naturally endemic to the majority of areas in the western hemisphere where maize is cultivated. A full-length reference transcriptome for Tripsacum dactyloides generated using long-read Iso-Seq data was used to characterize independent adaptation to temperate climates in this clade. Genes related to phospholipid biosynthesis, a critical component of cold acclimation in other cold-adapted plant lineages, were enriched among those genes experiencing more rapid rates of protein sequence evolution in T. dactyloides. In contrast with previous studies of parallel selection, we find that there is a significant overlap between the genes that were targets of artificial selection during the adaptation of maize to temperate climates and those that were targets of natural selection in temperate-adapted T. dactyloides. Genes related to growth, development, response to stimulus, signaling, and organelles were enriched in the set of genes identified as both targets of natural and artificial selection. © 2019 The Authors The Plant Journal © 2019 John Wiley & Sons Ltd.


April 21, 2020  |  

A High-Quality Grapevine Downy Mildew Genome Assembly Reveals Rapidly Evolving and Lineage-Specific Putative Host Adaptation Genes.

Downy mildews are obligate biotrophic oomycete pathogens that cause devastating plant diseases on economically important crops. Plasmopara viticola is the causal agent of grapevine downy mildew, a major disease in vineyards worldwide. We sequenced the genome of Pl. viticola with PacBio long reads and obtained a new 92.94 Mb assembly with high contiguity (359 scaffolds for an N50 of 706.5 kb) due to a better resolution of repeat regions. This assembly presented a high level of gene completeness, recovering 1,592 genes encoding secreted proteins involved in plant-pathogen interactions. Plasmopara viticola had a two-speed genome architecture, with secreted protein-encoding genes preferentially located in gene-sparse, repeat-rich regions and evolving rapidly, as indicated by pairwise dN/dS values. We also used short reads to assemble the genome of Plasmopara muralis, a closely related species infecting grape ivy (Parthenocissus tricuspidata). The lineage-specific proteins identified by comparative genomics analysis included a large proportion of RxLR cytoplasmic effectors and, more generally, genes with high dN/dS values. We identified 270 candidate genes under positive selection, including several genes encoding transporters and components of the RNA machinery potentially involved in host specialization. Finally, the Pl. viticola genome assembly generated here will allow the development of robust population genomics approaches for investigating the mechanisms involved in adaptation to biotic and abiotic selective pressures in this species. © The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

