July 19, 2019

Single-Molecule Real-Time Sequencing combined with optical mapping yields completely finished fungal genome.

Next-generation sequencing (NGS) technologies have increased the scalability, speed, and resolution of genomic sequencing and, thus, have revolutionized genomic studies. However, eukaryotic genome sequencing initiatives typically yield considerably fragmented genome assemblies. Here, we assessed various state-of-the-art sequencing and assembly strategies in order to produce a contiguous and complete eukaryotic genome assembly, focusing on the filamentous fungus Verticillium dahliae. Compared with Illumina-based assemblies of the V. dahliae genome, hybrid assemblies that also include PacBio-generated long reads establish superior contiguity. Intriguingly, provided that sufficient sequence depth is reached, assemblies solely based on PacBio reads outperform hybrid assemblies and even result in fully assembled chromosomes. Furthermore, the addition of optical map data allowed us to produce a gapless and complete V. dahliae genome assembly of the expected eight chromosomes from telomere to telomere. Consequently, we can now study genomic regions that were previously not assembled or poorly assembled, including regions that are populated by repetitive sequences, such as transposons, allowing us to fully appreciate an organism’s biological complexity. Our data show that a combination of PacBio-generated long reads and optical mapping can be used to generate complete and gapless assemblies of fungal genomes.
Studying whole-genome sequences has become an important aspect of biological research. The advent of next-generation sequencing (NGS) technologies has nowadays brought genomic science within reach of most research laboratories, including those that study nonmodel organisms. However, most genome sequencing initiatives typically yield (highly) fragmented genome assemblies. Nevertheless, considerable relevant information related to genome structure and evolution is likely hidden in those nonassembled regions. Here, we investigated a diverse set of strategies to obtain gapless genome assemblies, using the genome of a typical ascomycete fungus as the template. Eventually, we were able to show that a combination of PacBio-generated long reads and optical mapping yields a gapless telomere-to-telomere genome assembly, allowing in-depth genome analyses to facilitate functional studies into an organism’s biology. Copyright © 2015 Faino et al.


July 19, 2019

Improved maize reference genome with single-molecule technologies.

Complete and accurate reference genomes and annotations provide fundamental tools for characterization of genetic and functional variation. These resources facilitate the determination of biological processes and support translation of research findings into improved and sustainable agricultural technologies. Many reference genomes for crop plants have been generated over the past decade, but these genomes are often fragmented and missing complex repeat regions. Here we report the assembly and annotation of a reference genome of maize, a genetic and agricultural model species, using single-molecule real-time sequencing and high-resolution optical mapping. Relative to the previous reference genome, our assembly features a 52-fold increase in contig length and notable improvements in the assembly of intergenic spaces and centromeres. Characterization of the repetitive portion of the genome revealed more than 130,000 intact transposable elements, allowing us to identify transposable element lineage expansions that are unique to maize. Gene annotations were updated using 111,000 full-length transcripts obtained by single-molecule real-time sequencing. In addition, comparative optical mapping of two other inbred maize lines revealed a prevalence of deletions in regions of low gene density and maize lineage-specific genes.


July 7, 2019

Comparative genome analysis of Pseudomonas knackmussii B13, the first bacterium known to degrade chloroaromatic compounds.

Pseudomonas knackmussii B13 was the first strain to be isolated in 1974 that could degrade chlorinated aromatic hydrocarbons. This discovery was the prologue for subsequent characterization of numerous bacterial metabolic pathways and for genetic and biochemical studies, and it spurred ideas for pollutant bioremediation. In this study, we determined the complete genome sequence of B13 using next generation sequencing technologies and optical mapping. Genome annotation indicated that B13 has a variety of metabolic pathways for degrading monoaromatic hydrocarbons including chlorobenzoate, aminophenol, anthranilate and hydroxyquinol, but not polyaromatic compounds. Comparative genome analysis revealed that B13 is closest to Pseudomonas denitrificans and Pseudomonas aeruginosa. The B13 genome contains at least eight genomic islands [prophages and integrative conjugative elements (ICEs)], which were absent in closely related pseudomonads. We confirm that two ICEs are identical copies of the 103 kb self-transmissible element ICEclc that carries the genes for chlorocatechol metabolism. Comparison of ICEclc showed that it is composed of a variable and a ‘core’ region, which is very conserved among proteobacterial genomes, suggesting a widely distributed family of so far uncharacterized ICE. Resequencing of two spontaneous B13 mutants revealed a number of single nucleotide substitutions, as well as excision of a large 220 kb region and a prophage that drastically change the host metabolic capacity and survivability. © 2014 Society for Applied Microbiology and John Wiley & Sons Ltd.


July 7, 2019

Genome sequence of the Drosophila melanogaster male-killing Spiroplasma strain MSRO endosymbiont.

Spiroplasmas are helical and motile members of a cell wall-less eubacterial group called Mollicutes. Although all spiroplasmas are associated with arthropods, they exhibit great diversity with respect to both their modes of transmission and their effects on their hosts; ranging from horizontally transmitted pathogens and commensals to endosymbionts that are transmitted transovarially (i.e., from mother to offspring). Here we provide the first genome sequence, along with proteomic validation, of an endosymbiotic inherited Spiroplasma bacterium, the Spiroplasma poulsonii MSRO strain harbored by Drosophila melanogaster. Comparison of the genome content of S. poulsonii with that of horizontally transmitted spiroplasmas indicates that S. poulsonii has lost many metabolic pathways and transporters, demonstrating a high level of interdependence with its insect host. Consistent with genome analysis, experimental studies showed that S. poulsonii metabolizes glucose but not trehalose. Notably, trehalose is more abundant than glucose in Drosophila hemolymph, and the inability to metabolize trehalose may prevent S. poulsonii from overproliferating. Our study identifies putative virulence genes, notably, those for a chitinase, the H2O2-producing glycerol-3-phosphate oxidase, and enzymes involved in the synthesis of the eukaryote-toxic lipid cardiolipin. S. poulsonii also expresses on the cell membrane one functional adhesion-related protein and two divergent spiralin proteins that have been implicated in insect cell invasion in other spiroplasmas. These lipoproteins may be involved in the colonization of the Drosophila germ line, ensuring S. poulsonii vertical transmission. The S. poulsonii genome is a valuable resource to explore the mechanisms of male killing and symbiont-mediated protection, two cardinal features of many facultative endosymbionts.
Most insect species, including important disease vectors and crop pests, harbor vertically transmitted endosymbiotic bacteria. These endosymbionts play key roles in their hosts’ fitness, including protecting them against natural enemies and manipulating their reproduction in ways that increase the frequency of symbiont infection. Little is known about the molecular mechanisms that underlie these processes. Here, we provide the first genome draft of a vertically transmitted male-killing Spiroplasma bacterium, the S. poulsonii MSRO strain harbored by D. melanogaster. Analysis of the S. poulsonii genome was complemented by proteomics and ex vivo metabolic experiments. Our results indicate that S. poulsonii has reduced metabolic capabilities and expresses divergent membrane lipoproteins and potential virulence factors that likely participate in Spiroplasma-host interactions. This work fills a gap in our knowledge of insect endosymbionts and provides tools with which to decipher the interaction between Spiroplasma bacteria and their well-characterized host D. melanogaster, which is emerging as a model of endosymbiosis. Copyright © 2015 Paredes et al.


July 7, 2019

De novo assembly of Dekkera bruxellensis: a multi technology approach using short and long-read sequencing and optical mapping.

It remains a challenge to perform de novo assembly using next-generation sequencing (NGS). Despite the availability of multiple sequencing technologies and tools (e.g., assemblers) it is still difficult to assemble new genomes at chromosome resolution (i.e., one sequence per chromosome). Obtaining high quality draft assemblies is extremely important in the case of yeast genomes to better characterise major events in their evolutionary history. The aim of this work is two-fold: on the one hand we want to show how combining different and somewhat complementary technologies is key to improving assembly quality and correctness, and on the other hand we present a de novo assembly pipeline we believe to be beneficial to core facility bioinformaticians. To demonstrate both the effectiveness of combining technologies and the simplicity of the pipeline, here we present the results obtained using the Dekkera bruxellensis genome.
In this work we used short-read Illumina data and long-read PacBio data combined with the extreme long-range information from OpGen optical maps in the task of de novo genome assembly and finishing. Moreover, we developed NouGAT, a semi-automated pipeline for read-preprocessing, de novo assembly and assembly evaluation, which was instrumental for this work.
We obtained a high quality draft assembly of a yeast genome, resolved on a chromosomal level. Furthermore, this assembly was corrected for mis-assembly errors as demonstrated by resolving a large collapsed repeat and by receiving higher scores by assembly evaluation tools. With the inclusion of PacBio data we were able to fill about 5% of the optical mapped genome not covered by the Illumina data.


July 7, 2019

High-coverage sequencing and annotated assemblies of the budgerigar genome.

Parrots belong to a group of behaviorally advanced vertebrates and have an advanced ability of vocal learning relative to other vocal-learning birds. They can imitate human speech, synchronize their body movements to a rhythmic beat, and understand complex concepts of referential meaning to sounds. However, little is known about the genetics of these traits. Elucidating the genetic bases would require whole genome sequencing and a robust assembly of a parrot genome.
We present a genomic resource for the budgerigar, an Australian Parakeet (Melopsittacus undulatus), the most widely studied parrot species in neuroscience and behavior. We present genomic sequence data that includes over 300× raw read coverage from multiple sequencing technologies and chromosome optical maps from a single male animal. The reads and optical maps were used to create three hybrid assemblies representing some of the largest genomic scaffolds to date for a bird; two of which were annotated based on similarities to reference sets of non-redundant human, zebra finch and chicken proteins, and budgerigar transcriptome sequence assemblies. The sequence reads for this project were in part generated and used for both the Assemblathon 2 competition and the first de novo assembly of a giga-scale vertebrate genome utilizing PacBio single-molecule sequencing.
Across several quality metrics, these budgerigar assemblies are comparable to or better than the chicken and zebra finch genome assemblies built from traditional Sanger sequencing reads, and are sufficient to analyze regions that are difficult to sequence and assemble, including those not yet assembled in prior bird genomes, and promoter regions of genes differentially regulated in vocal learning brain regions. This work provides valuable data and material for genome technology development and for investigating the genomics of complex behavioral traits.


July 7, 2019

Enterobacter asburiae strain L1: complete genome and whole genome optical mapping analysis of a quorum sensing bacterium.

Enterobacter asburiae L1 is a quorum sensing bacterium isolated from lettuce leaves. In this study, for the first time, the complete genome of E. asburiae L1 was sequenced using the single molecule real time sequencer (PacBio RSII) and the whole genome sequence was verified by using optical genome mapping (OpGen) technology. In our previous study, E. asburiae L1 was reported to produce AHLs, suggesting the possibility of quorum sensing-dependent regulation of virulence factors. This evoked our interest in studying the genome of this bacterium, and here we present the complete genome of E. asburiae L1, which carries the virulence factor gene virK, the N-acyl homoserine lactone-based QS transcriptional regulator gene luxR and the N-acyl homoserine lactone synthase gene, which we have named easI. The availability of the whole genome sequence of E. asburiae L1 will pave the way for the study of the QS-mediated gene expression in this bacterium. Hence, the importance and functions of these signaling molecules can be further studied in the hope of elucidating the mechanisms of QS-regulation in E. asburiae. To the best of our knowledge, this is the first documentation of both a complete genome sequence and the establishment of the molecular basis of QS properties of E. asburiae.


July 7, 2019

Potential impact on kidney infection: a whole-genome analysis of Leptospira santarosai serovar Shermani.

Leptospira santarosai serovar Shermani is the most frequently encountered serovar, and it causes leptospirosis and tubulointerstitial nephritis in Taiwan. This study aims to complete the genome sequence of L. santarosai serovar Shermani and analyze the transcriptional responses of L. santarosai serovar Shermani to renal tubular cells. To assemble this highly repetitive genome, we combined reads that were generated from four next-generation sequencing platforms by using hybrid assembly approaches to finish two-chromosome contiguous sequences without gaps by validating the data with optical restriction maps and Sanger sequencing. Whole-genome comparison studies revealed a 28-kb region containing genes that encode transposases and hypothetical proteins in L. santarosai serovar Shermani, but this region is absent in other pathogenic Leptospira spp. We found that lipoprotein gene expression in both L. santarosai serovar Shermani and L. interrogans serovar Copenhageni was upregulated upon interaction with renal tubular cells, and LSS19962, an L. santarosai serovar Shermani-specific gene within a 28-kb region that encodes hypothetical proteins, was upregulated in L. santarosai serovar Shermani-infected renal tubular cells. Lipoprotein expression during leptospiral infection might facilitate the interactions of leptospires within kidneys. The availability of the whole-genome sequence of L. santarosai serovar Shermani would make it the first completed sequence of this species, and its comparison with that of other Leptospira spp. may provide invaluable information for further studies in leptospiral pathogenesis.


July 7, 2019

Single-molecule fluorescence imaging of processive myosin with enhanced background suppression using linear zero-mode waveguides (ZMWs) and convex lens induced confinement (CLIC).

Resolving single fluorescent molecules in the presence of high fluorophore concentrations remains a challenge in single-molecule biophysics that limits our understanding of weak molecular interactions. Total internal reflection fluorescence (TIRF) imaging, the workhorse of single-molecule fluorescence microscopy, enables experiments at concentrations up to about 100 nM, but many biological interactions have considerably weaker affinities, and thus require at least one species to be at micromolar or higher concentration. Current alternatives to TIRF often require three-dimensional confinement, and thus can be problematic for extended substrates, such as cytoskeletal filaments. To address this challenge, we have demonstrated and applied two new single-molecule fluorescence microscopy techniques, linear zero-mode waveguides (ZMWs) and convex lens induced confinement (CLIC), for imaging the processive motion of molecular motors myosin V and VI along actin filaments. Both technologies will allow imaging in the presence of higher fluorophore concentrations than TIRF microscopy. They will enable new biophysical measurements of a wide range of processive molecular motors that move along filamentous tracks, such as other myosins, dynein, and kinesin. A particularly salient application of these technologies will be to examine chemomechanical coupling by directly imaging fluorescent nucleotide molecules interacting with processive motors as they traverse their actin or microtubule tracks.


July 7, 2019

Structure of the type IV secretion system in different strains of Anaplasma phagocytophilum.

Anaplasma phagocytophilum is an intracellular organism in the Order Rickettsiales that infects diverse animal species and is causing an emerging disease in humans, dogs and horses. Different strains have very different cell tropisms and virulence. For example, in the U.S., strains have been described that infect ruminants but not dogs or rodents. An intriguing question is how the strains of A. phagocytophilum differ and what different genome loci are involved in cell tropisms and/or virulence. Type IV secretion systems (T4SS) are responsible for translocation of substrates across the cell membrane by mechanisms that require contact with the recipient cell. They are especially important in organisms such as the Rickettsiales which require T4SS to aid colonization and survival within both mammalian and tick vector cells. We determined the structure of the T4SS in 7 strains from the U.S. and Europe and revised the sequence of the repetitive virB6 locus of the human HZ strain.
Although in all strains the T4SS conforms to the previously described split loci for vir genes, there is great diversity within these loci among strains. This is particularly evident in the virB2 and virB6 which are postulated to encode the secretion channel and proteins exposed on the bacterial surface. VirB6-4 has an unusual highly repetitive structure and can have a molecular weight greater than 500,000. For many of the virs, phylogenetic trees position A. phagocytophilum strains infecting ruminants in the U.S. and Europe distant from strains infecting humans and dogs in the U.S.
Our study reveals evidence of gene duplication and considerable diversity of T4SS components in strains infecting different animals. The diversity in virB2 is in both the total number of copies, which varied from 8 to 15 in the herein characterized strains, and in the sequence of each copy. The diversity in virB6 is in the sequence of each of the 4 copies in the single locus and the presence of varying numbers of repetitive units in virB6-3 and virB6-4. These data suggest that the T4SS should be investigated further for a potential role in strain virulence of A. phagocytophilum.


July 7, 2019

Improving and correcting the contiguity of long-read genome assemblies of three plant species using optical mapping and chromosome conformation capture data.

Long-read sequencing can overcome the weaknesses of short reads in the assembly of eukaryotic genomes; however, at present additional scaffolding is needed to achieve chromosome-level assemblies. We generated PacBio long-read data of the genomes of three relatives of the model plant Arabidopsis thaliana and assembled all three genomes into only a few hundred contigs. To improve the contiguities of these assemblies, we generated BioNano Genomics optical mapping and Dovetail Genomics chromosome conformation capture data for genome scaffolding. Despite their technical differences, optical mapping and chromosome conformation capture performed similarly and doubled N50 values. After improving both integration methods, assembly contiguity reached chromosome-arm-levels. We rigorously assessed the quality of contigs and scaffolds using Illumina mate-pair libraries and genetic map information. This showed that PacBio assemblies have high sequence accuracy but can contain several misassemblies, which join unlinked regions of the genome. Most, but not all of these mis-joints were removed during the integration of the optical mapping and chromosome conformation capture data. Even though none of the centromeres was fully assembled, the scaffolds revealed large parts of some centromeric regions, even including some of the heterochromatic regions, which are not present in gold standard reference sequences. Published by Cold Spring Harbor Laboratory Press.
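
For readers unfamiliar with the N50 statistic mentioned in this abstract: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal sketch of that definition; the function name `n50` and the example lengths are illustrative assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are taken from longest to shortest; N50 is the length of
    the sequence at which the running total first reaches half of the
    summed assembly length.
    """
    if not lengths:
        raise ValueError("at least one length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical lengths (arbitrary units) before and after scaffolding.
contigs = [100, 80, 60, 40, 20]
scaffolds = [180, 100, 20]  # 100+80 and 60+40 joined, same total length
print(n50(contigs), n50(scaffolds))
```

In this made-up example, joining contigs into scaffolds leaves the total length unchanged but raises N50, which is the kind of contiguity gain the abstract reports.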


July 7, 2019

Combination of short-read, long-read and optical mapping assemblies reveals large-scale tandem repeat arrays with population genetic implications.

Accurate and contiguous genome assembly is key to a comprehensive understanding of the processes shaping genomic diversity and evolution. Yet, it is frequently constrained by constitutive heterochromatin, usually characterized by highly repetitive DNA. As a key feature of genome architecture associated with centromeric and telomeric regions it influences meiotic recombination. In this study, we assess the impact of large tandem repeat arrays on the recombination rate landscape in an avian speciation model, the Eurasian crow. We assembled two high-quality genome references using single-molecule real-time sequencing (long-read assembly, LR) and single-molecule restriction maps (optical map assembly, OM). A three-way comparison including the published short-read assembly (SR) constructed for the same individual allowed assessing assembly properties and pinpointing mis-assemblies. Combining information from all three assemblies, we characterized 36 previously unidentified large repetitive regions in the proximity of sequence assembly breakpoints, the majority of which contained complex arrays of a 14-kb satellite repeat or its 1.2-kb subunit. Using genome-wide population re-sequencing data, we estimated the population-scaled recombination rate (ρ) and found it to be significantly reduced in these regions. These findings are consistent with an effect of low recombination in regions adjacent to centromeric or subtelomeric heterochromatin, and add to our understanding of the processes generating widespread heterogeneity in genetic diversity and differentiation along the genome. By combining three independent technologies, our results highlight the importance of adding a layer of information on genome structure inaccessible to each approach independently. Published by Cold Spring Harbor Laboratory Press.

