April 21, 2020  |  

Comparative genomics and pathogenicity potential of members of the Pseudomonas syringae species complex on Prunus spp.

Diseases on Prunus spp. have been associated with a large number of phylogenetically different pathovars and species within the P. syringae species complex. Despite their economic significance, there is a severe lack of genomic information on these pathogens. The high phylogenetic diversity observed among strains causing disease on Prunus spp. in nature raised the question of whether other strains or species within the P. syringae species complex are potentially pathogenic on Prunus spp. To gain insight into the genomic potential for adaptation and virulence in Prunus spp., a total of twelve de novo whole genome sequences were generated for P. syringae pathovars and species found in association with diseases on cherry (sweet, sour and ornamental cherry) and peach. Strains sequenced in this study covered three phylogroups and four clades. These strains were screened in vitro for pathogenicity on Prunus spp. together with additional genome-sequenced strains, thus covering nine of the thirteen currently defined P. syringae phylogroups. Pathogenicity tests revealed that most of the strains caused symptoms in vitro, and comparative genomics found no obvious link between the presence of known virulence factors and the observed pathogenicity pattern. Non-pathogenic strains displayed generation times two to three times longer when grown in rich medium. In this study, the first set of complete genomes of cherry-associated P. syringae strains as well as the draft genome of the quarantine peach pathogen P. syringae pv. persicae were generated. The obtained genomic data were matched with phenotypic data in order to determine factors related to pathogenicity on Prunus spp. Results of this study suggest that the inability to cause disease on Prunus spp. in vitro is not the result of host specialization but is rather linked to metabolic impairments of individual strains.


April 21, 2020  |  

Metaepigenomic analysis reveals the unexplored diversity of DNA methylation in an environmental prokaryotic community.

DNA methylation plays important roles in prokaryotes, and their genomic landscapes (prokaryotic epigenomes) have recently begun to be disclosed. However, our knowledge of prokaryotic methylation systems is focused on those of culturable microbes, which are rare in nature. Here, we used single-molecule real-time and circular consensus sequencing techniques to reveal the ‘metaepigenomes’ of a microbial community in the largest lake in Japan, Lake Biwa. We reconstructed 19 draft genomes from diverse bacterial and archaeal groups, most of which are yet to be cultured. The analysis of DNA chemical modifications in those genomes revealed 22 methylated motifs, nine of which were novel. We identified methyltransferase genes likely responsible for methylation of the novel motifs, and confirmed the catalytic specificities of four of them via transformation experiments using synthetic genes. Our study highlights metaepigenomics as a powerful approach for identification of the vast unexplored variety of prokaryotic DNA methylation systems in nature.


April 21, 2020  |  

Assignment of virus and antimicrobial resistance genes to microbial hosts in a complex microbial community by combined long-read assembly and proximity ligation.

We describe a method that adds long-read sequencing to a mix of technologies used to assemble a highly complex cattle rumen microbial community, and provide a comparison to short read-based methods. Long-read alignments and Hi-C linkage between contigs support the identification of 188 novel virus-host associations and the determination of phage life cycle states in the rumen microbial community. The long-read assembly also identifies 94 antimicrobial resistance genes, compared to only seven alleles in the short-read assembly. We demonstrate novel techniques that work synergistically to improve characterization of biological features in a highly complex rumen microbial community.


April 21, 2020  |  

Long-read based de novo assembly of low-complexity metagenome samples results in finished genomes and reveals insights into strain diversity and an active phage system.

Complete and contiguous genome assemblies greatly improve the quality of subsequent systems-wide functional profiling studies and the ability to gain novel biological insights. While a de novo genome assembly of an isolated bacterial strain is in most cases straightforward, more informative data about co-existing bacteria as well as synergistic and antagonistic effects can be obtained from a direct analysis of microbial communities. However, the complexity of metagenomic samples represents a major challenge. While third generation sequencing technologies have been suggested to enable finished metagenome-assembled genomes, to our knowledge, the complete genome assembly of all dominant strains in a microbiome sample has not been demonstrated. Natural whey starter cultures (NWCs) are used in cheese production and represent low-complexity microbiomes. Previous studies of Swiss Gruyère and selected Italian hard cheeses, mostly based on amplicon metagenomics, concurred that three species generally predominate: Streptococcus thermophilus, Lactobacillus helveticus and Lactobacillus delbrueckii. Two NWCs from Swiss Gruyère producers were subjected to whole metagenome shotgun sequencing using the Pacific Biosciences Sequel and Illumina MiSeq platforms. In addition, longer Oxford Nanopore Technologies MinION reads had to be generated for one NWC to resolve repeat regions. Thereby, we achieved the complete assembly of all dominant bacterial genomes from these low-complexity NWCs, which was corroborated by a 16S rRNA amplicon survey. Moreover, two distinct L. helveticus strains were successfully co-assembled from the same sample. Besides bacterial chromosomes, we could also assemble several bacterial plasmids and phages and a corresponding prophage. Biologically relevant insights were uncovered by linking the plasmids and phages to their respective host genomes using DNA methylation motifs on the plasmids and by matching prokaryotic CRISPR spacers with the corresponding protospacers on the phages. These results could only be achieved by employing long-read sequencing data able to span intragenomic as well as intergenomic repeats. Here, we demonstrate the feasibility of complete de novo genome assembly of all dominant strains from low-complexity NWCs based on whole metagenome shotgun sequencing data. This allowed us to gain novel biological insights and provides a fundamental basis for subsequent systems-wide omics analyses, functional profiling and phenotype-to-genotype analysis of specific microbial communities.


April 21, 2020  |  

The wild sweetpotato (Ipomoea trifida) genome provides insights into storage root development.

Sweetpotato (Ipomoea batatas (L.) Lam.) is the seventh most important crop in the world and is mainly cultivated for its underground storage root (SR). Genetic studies of this species have been hindered by the lack of a high-quality reference sequence due to its complex genome structure. Diploid Ipomoea trifida is the closest relative and putative progenitor of sweetpotato, and is considered a model species for sweetpotato in genetic, cytological, and physiological analyses. Here, we generated the chromosome-scale genome sequence of SR-forming diploid I. trifida var. Y22 with high heterozygosity (2.20%). Although the chromosome-based synteny analysis revealed that I. trifida shared a conserved karyotype with Ipomoea nil after the separation, I. trifida had a much smaller genome than I. nil due to more efficient elimination of LTR-retrotransposons (LTR-RTs) and a lack of species-specific amplification bursts of LTR-RTs. A comparison with four non-SR-forming species showed that the evolution of the beta-amylase gene family may be related to SR formation. We further investigated the relationship of the key gene BMY11 (with 47.12% identity to beta-amylase 1) with this important agronomic trait by both gene expression profiling and quantitative trait locus (QTL) mapping. Combining SR morphology and structure, gene expression profiling and qPCR results, we deduced that the products of BMY11 activity in splitting starch granules are recycled to synthesize larger granules, contributing to starch accumulation and SR swelling. Moreover, we found that the expression patterns of BMY11, sporamin proteins and the key genes involved in carbohydrate metabolism and stele lignification were similar to those of sweetpotato during SR development. We constructed the high-quality genome reference of the highly heterozygous I. trifida through a combined approach, and this genome enables a better resolution of the genomic features and genome evolution of this species. Sweetpotato SR development genes can be identified in I. trifida, and these genes show similar functions and expression patterns, indicating that the diploid I. trifida var. Y22 with typical SR could be considered an ideal model for studies of sweetpotato SR development.


April 21, 2020  |  

Characterization of a male specific region containing a candidate sex determining gene in Atlantic cod.

The genetic mechanisms determining sex in teleost fishes are highly variable and the master sex determining gene has only been identified in a few species. Here we characterize a male-specific region of 9 kb on linkage group 11 in Atlantic cod (Gadus morhua) harboring a single gene named zkY for zinc knuckle on the Y chromosome. Diagnostic PCR tests of phenotypically sexed males and females confirm the sex-specific nature of the Y-sequence. We identified twelve highly similar autosomal gene copies of zkY, of which eight code for proteins containing the zinc knuckle motif. 3D modeling suggests that the amino acid changes observed in six copies might influence the putative RNA-binding specificity. Cod zkY and the autosomal proteins zk1 and zk2 possess an identical zinc knuckle structure, but only the Y-specific gene zkY was expressed at high levels in the developing larvae before the onset of sex differentiation. Collectively these data suggest zkY as a candidate master masculinization gene in Atlantic cod. PCR amplification of Y-sequences in Arctic cod (Arctogadus glacialis) and Greenland cod (Gadus macrocephalus ogac) suggests that the male-specific region emerged in codfishes more than 7.5 million years ago.


April 21, 2020  |  

Haplotype-aware diplotyping from noisy long reads.

Current genotyping approaches for single-nucleotide variations rely on short, accurate reads from second-generation sequencing devices. Presently, third-generation sequencing platforms are rapidly becoming more widespread, yet approaches for leveraging their long but error-prone reads for genotyping are lacking. Here, we introduce a novel statistical framework for the joint inference of haplotypes and genotypes from noisy long reads, which we term diplotyping. Our technique takes full advantage of linkage information provided by long reads. We validate hundreds of thousands of candidate variants that have not yet been included in the high-confidence reference set of the Genome-in-a-Bottle effort.


April 21, 2020  |  

A draft genome for Spatholobus suberectus.

Spatholobus suberectus Dunn (S. suberectus), which belongs to the Leguminosae, is an important medicinal plant in China. Owing to its long growth cycle and increased use in human medicine, wild resources of S. suberectus have decreased rapidly and may be on the verge of extinction. De novo assembly of the whole S. suberectus genome provides a critical resource for studying the biosynthesis of the main bioactive components and the mechanisms regulating seed development in this plant. Utilizing several sequencing technologies such as Illumina HiSeq X Ten, single-molecule real-time sequencing and 10x Genomics, as well as new assembly techniques such as FALCON and chromatin interaction mapping (Hi-C), we assembled a chromosome-scale genome about 798 Mb in size. In total, 748 Mb (93.73%) of the contig sequences were anchored onto nine chromosomes, with the longest scaffold being 103.57 Mb. Further annotation analyses predicted 31,634 protein-coding genes, of which 93.9% have been functionally annotated. All data generated in this study are available in public databases.


April 21, 2020  |  

Adaptive Strategies in a Poly-Extreme Environment: Differentiation of Vegetative Cells in Serratia ureilytica and Resistance to Extreme Conditions.

Poly-extreme terrestrial habitats are often used as analogs to extra-terrestrial environments. Understanding the adaptive strategies allowing bacteria to thrive and survive under these conditions could help in our quest for extra-terrestrial planets suitable for life and in understanding how life evolved in the harsh early Earth conditions. A prime example of such a survival strategy is the modification of vegetative cells into resistant resting structures. These differentiated cells are often observed in response to harsh environmental conditions. The environmental strain (strain Lr5/4) belonging to Serratia ureilytica was isolated from a geothermal spring in Lirima, Atacama Desert, Chile. The Atacama Desert is the driest habitat on Earth and furthermore, due to its high altitude, it is exposed to an increased amount of UV radiation. The geothermal spring from which the strain was isolated is oligotrophic and the temperature of 54°C exceeds mesophilic conditions (15 to 45°C). Although the vegetative cells were tolerant to various environmental insults (desiccation, extreme pH, glycerol), a modified cell type was formed in response to nutrient deprivation, UV radiation and thermal shock. Scanning (SEM) and Transmission Electron Microscopy (TEM) analyses of vegetative cells and the modified cell structures were performed. In SEM, a change toward a circular shape with reduced size was observed. These circular cells possessed what appear to be extra coating layers under TEM. The resistance of the modified cells was also investigated; they were resistant to wet heat, UV radiation and desiccation, while vegetative cells did not withstand any of those conditions. A phylogenomic analysis was undertaken to investigate the presence of known genes involved in dormancy in other bacterial clades. Genes related to spore formation in Myxococcus and Firmicutes were found in the S. ureilytica Lr5/4 genome; however, these genes were not enough for a full sporulation pathway that resembles either group. Although the molecular pathway of cell differentiation in S. ureilytica Lr5/4 is not fully defined, the identified genes may contribute to the modified phenotype in the Serratia genus. Here, we show that a modified cell structure can occur as a response to extremity in a species that was previously not known to deploy this strategy. This strategy may be widespread in bacteria, but only expressed under poly-extreme environmental conditions.


April 21, 2020  |  

Systematic analysis of dark and camouflaged genes reveals disease-relevant genes hiding in plain sight.

The human genome contains “dark” gene regions that cannot be adequately assembled or aligned using standard short-read sequencing technologies, preventing researchers from identifying mutations within these gene regions that may be relevant to human disease. Here, we identify regions with few mappable reads that we call dark by depth, and others that have ambiguous alignment, called camouflaged. We assess how well long-read or linked-read technologies resolve these regions. Based on standard whole-genome Illumina sequencing data, we identify 36,794 dark regions in 6054 gene bodies from pathways important to human health, development, and reproduction. Of these gene bodies, 8.7% are completely dark and 35.2% are ≥5% dark. We identify dark regions that are present in protein-coding exons across 748 genes. Linked-read or long-read sequencing technologies from 10x Genomics, PacBio, and Oxford Nanopore Technologies reduce dark protein-coding regions to approximately 50.5%, 35.6%, and 9.6%, respectively. We present an algorithm to resolve most camouflaged regions and apply it to the Alzheimer’s Disease Sequencing Project. We rescue a rare ten-nucleotide frameshift deletion in CR1, a top Alzheimer’s disease gene, found in disease cases but not in controls. While we could not formally assess the association of the CR1 frameshift mutation with Alzheimer’s disease due to insufficient sample size, we believe it merits investigating in a larger cohort. There remain thousands of potentially important genomic regions overlooked by short-read sequencing that are largely resolved by long-read technologies.
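The idea of a region being "dark by depth" can be illustrated with a minimal Python sketch that scans per-base read depth and reports contiguous low-depth intervals. This is an illustration only, not the authors' pipeline: the function name dark_regions, the max_depth threshold of 5 reads, and the min_length parameter are all assumptions made for this example, since the abstract says only "few mappable reads".

```python
def dark_regions(depths, max_depth=5, min_length=1):
    """Return half-open (start, end) intervals where depth stays <= max_depth.

    depths     -- per-base read depth along one sequence (list of ints)
    max_depth  -- ASSUMED threshold for "few mappable reads"
    min_length -- ASSUMED minimum interval length to report
    """
    regions = []
    start = None  # start of the low-depth run currently being extended
    for pos, depth in enumerate(depths):
        if depth <= max_depth:
            if start is None:
                start = pos
        elif start is not None:
            # The run ended at the previous base; keep it if long enough.
            if pos - start >= min_length:
                regions.append((start, pos))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(depths) - start >= min_length:
        regions.append((start, len(depths)))
    return regions


if __name__ == "__main__":
    example = [30, 2, 0, 4, 25, 40, 1]
    print(dark_regions(example))
```

Camouflaged regions, which have reads but ambiguous alignments, would need mapping-quality information rather than depth alone and are not covered by this sketch.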


April 21, 2020  |  

Linking CRISPR-Cas9 interference in cassava to the evolution of editing-resistant geminiviruses.

Geminiviruses cause damaging diseases in several important crop species. However, limited progress has been made in developing crop varieties resistant to these highly diverse DNA viruses. Recently, the bacterial CRISPR/Cas9 system has been transferred to plants to target and confer immunity to geminiviruses. In this study, we use CRISPR-Cas9 interference in the staple food crop cassava with the aim of engineering resistance to African cassava mosaic virus, a member of a widespread and important family (Geminiviridae) of plant-pathogenic DNA viruses. Our results show that the CRISPR system fails to confer effective resistance to the virus during glasshouse inoculations. Further, we find that between 33 and 48% of edited virus genomes evolve a conserved single-nucleotide mutation that confers resistance to CRISPR-Cas9 cleavage. We also find that in the model plant Nicotiana benthamiana the replication of the novel, mutant virus is dependent on the presence of the wild-type virus. Our study highlights the risks associated with CRISPR-Cas9 virus immunity in eukaryotes given that the mutagenic nature of the system generates viral escapes in a short time period. Our in-depth analysis of virus populations also represents a template for future studies analyzing virus escape from anti-viral CRISPR transgenics. This is especially important for informing regulation of such actively mutagenic applications of CRISPR-Cas9 technology in agriculture.


April 21, 2020  |  

Transcriptomic profiles of 33 opium poppy samples in different tissues, growth phases, and cultivars.

Opium poppy is one of the most important medicinal plants and remains the only commercial source of morphinan-based painkillers. However, little is known about the regulatory mechanisms involved in benzylisoquinoline alkaloid (BIA) biosynthesis in opium poppy. Herein, the full-length transcriptome dataset of opium poppy was constructed for the first time, accompanied by Illumina transcriptome data from 33 samples of different tissues, growth phases and cultivars. The long-read sequencing produced 902,140 raw reads with 55,114 high-quality transcripts, and short-read sequencing produced 1,923,679,864 clean reads with an average Q30 rate of 93%. The high-quality transcripts were subsequently quantified using the short reads, and the expression of each unigene among different samples was calculated as reads per kilobase per million mapped reads (RPKM). These data provide a foundation for opium poppy transcriptomic analysis, which may aid in capturing splice variants and some non-coding RNAs involved in the regulation of BIA biosynthesis. They can also be used for genome assembly and annotation, which will aid the identification of new transcripts.
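For readers unfamiliar with the RPKM unit mentioned above, the standard definition is the read count for a transcript multiplied by 10^9 and divided by the product of the transcript length in bases and the total number of mapped reads in the sample. The following is a minimal Python sketch of that formula; the function name rpkm and the example numbers are illustrative assumptions, not values from the study.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    read_count           -- reads assigned to this transcript
    transcript_length_bp -- transcript length in bases
    total_mapped_reads   -- all mapped reads in the sample
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Hypothetical transcript: 500 reads, 2 kb long, 10 million mapped reads.
    print(rpkm(500, 2000, 10_000_000))
```

Normalizing by both length and sequencing depth is what makes values comparable across transcripts of different sizes and across samples with different read totals.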


April 21, 2020  |  

In the name of the rose: a roadmap for rose research in the genome era.

The recent completion of the rose genome sequence is not the end of a process, but rather a starting point that opens up a whole set of new and exciting activities. Alongside a high-quality genome sequence, other genomic tools have also become available for rose, including transcriptomics data, a high-density single-nucleotide polymorphism array and software to perform linkage and quantitative trait locus mapping in polyploids. Rose cultivars are highly heterogeneous and diverse. This vast diversity in cultivated roses can be explained through the genetic potential of the genus, introgressions from wild species into commercial tetraploid germplasm and the inimitable efforts of historical breeders. We can now investigate how this diversity can best be exploited and refined in future breeding work, given the rich molecular toolbox now available to the rose breeding community. This paper presents possible lines of research now that rose has entered the genomics era, and attempts to partially answer the question that arises after the completion of any draft genome sequence: ‘Now that we have “the” genome, what’s next?’. Having access to a genome sequence will allow both (fundamental) scientific and (applied) breeding-orientated questions to be addressed. We outline possible approaches for a number of these questions.


April 21, 2020  |  

A high-quality de novo genome assembly from a single mosquito using PacBio sequencing

A high-quality reference genome is a fundamental resource for functional genetics, comparative genomics, and population genomics, and is increasingly important for conservation biology. PacBio Single Molecule, Real-Time (SMRT) sequencing generates long reads with uniform coverage and high consensus accuracy, making it a powerful technology for de novo genome assembly. Improvements in throughput and concomitant reductions in cost have made PacBio an attractive core technology for many large genome initiatives; however, relatively high DNA input requirements (~5 µg for the standard library protocol) have placed PacBio out of reach for many projects on small organisms that have lower DNA content, or on projects with limited input DNA for other reasons. Here we present a high-quality de novo genome assembly from a single Anopheles coluzzii mosquito. A modified SMRTbell library construction protocol without DNA shearing and size selection was used to generate a SMRTbell library from just 100 ng of starting genomic DNA. The sample was run on the Sequel System with chemistry 3.0 and software v6.0, generating, on average, 25 Gb of sequence per SMRT Cell with 20 h movies, followed by diploid de novo genome assembly with FALCON-Unzip. The resulting curated assembly had high contiguity (contig N50 3.5 Mb) and completeness (more than 98% of conserved genes were present and full-length). In addition, this single-insect assembly now places 667 (>90%) of formerly unplaced genes into their appropriate chromosomal contexts in the AgamP4 PEST reference. We were also able to resolve maternal and paternal haplotypes for over one-third of the genome. Because the sequenced material came from a single diploid individual, only two haplotypes were present, simplifying the assembly process compared to samples from multiple pooled individuals. The method presented here can be applied to samples with starting DNA amounts as low as 100 ng per 1 Gb genome size. This new low-input approach puts PacBio-based assemblies in reach for small, highly heterozygous organisms that comprise much of the diversity of life.
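The contig N50 quoted above is a standard contiguity statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The following is a minimal Python sketch of that calculation; the function name n50 and the toy contig lengths are illustrative assumptions and are unrelated to the mosquito assembly.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    summed length of all contigs.
    """
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Toy assembly: total length 20, so half is 10; 6 + 5 = 11 reaches it.
    print(n50([2, 3, 4, 5, 6]))
```

A higher N50 means that more of the assembly sits in long contigs, which is why it is commonly reported alongside gene-completeness measures.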

