June 1, 2021  |  

Single Molecule, Real-Time Sequencing for base modification detection in eukaryotic organisms: Coprinopsis cinerea.

Single Molecule Real-Time (SMRT) DNA sequencing provides a wealth of kinetic information beyond the extraction of the primary DNA sequence, and this kinetic information can provide for the direct detection of modified bases present in genomic DNA. This method has been demonstrated for base modification detection in prokaryotes at base and strand resolutions. In eukaryotes, the common base modifications known to exist are the cytosine variants including methyl, hydroxymethyl, formyl and carboxyl forms. Each of these modifications exhibits different signatures in SMRT kinetic data, allowing for unprecedented possibilities to differentiate between them in direct sequencing data. We present early results of directly sequencing different base modifications in eukaryotic genomic DNA using this method.


June 1, 2021  |  

De novo assembly of a complex panicoid grass genome using ultra-long PacBio reads with P6C4 chemistry

Drought is responsible for much of the global losses in crop yields and understanding how plants naturally cope with drought stress is essential for breeding and engineering crops for the changing climate. Resurrection plants desiccate to complete dryness during times of drought, then “come back to life” once water is available making them an excellent model for studying drought tolerance. Understanding the molecular networks governing how resurrection plants handle desiccation will provide targets for crop engineering. Oropetium thomaeum (Oro) is a resurrection plant that also has the smallest known grass genome at 250 Mb compared to Brachypodium distachyon (300 Mb) and rice (350 Mb). Plant genomes, especially grasses, have complex repeat structures such as telomeres, centromeres, and ribosomal gene cassettes, and high heterozygosity, which makes them difficult to assembly using short read next generation sequencing technologies. Ultra-long PacBio reads using the new P6C4 chemistry and the latest 15kb Blue Pippin size-selection protocol to generate 20kb insert libraries that yielded an average read length of 12kb providing ~72X coverage, and 10X coverage with reads over 20kb. The HGAP assembly covers 98% of the genome with a contig N50 of 2.4 Mb, which makes it one of the highest quality and most complete plant genomes assembled to date. Oro has a compact genome structure compared to other grasses with only 16% repeat sequences but has very good collinearity with other grasses. Understanding the genomic mechanisms of extreme desiccation tolerance in resurrection plants like Oro will provide insights for engineering and intelligent breeding of improved food, fuel, and fiber crops.


June 1, 2021  |  

The resurgence of reference quality genome sequence.

Since the advent of Next-Generation Sequencing (NGS), the cost of de novo genome sequencing and assembly have dropped precipitately, which has spurred interest in genome sequencing overall. Unfortunately the contiguity of the NGS assembled sequences, as well as the accuracy of these assemblies have suffered. Additionally, most NGS de novo assemblies leave large portions of genomes unresolved, and repetitive regions are often collapsed. When compared to the reference quality genome sequences produced before the NGS era, the new sequences are highly fragmented and often prove to be difficult to properly annotate. In some cases the contiguous portions are smaller than the average gene size making the sequence not nearly as useful for biologists as the earlier reference quality genomes including of Human, Mouse, C. elegans, or Drosophila. Recently, new 3rd generation sequencing technologies, long-range molecular techniques, and new informatics tools have facilitated a return to high quality assembly. We will discuss the capabilities of the technologies and assess their impact on assembly projects across the tree of life from small microbial and fungal genomes through large plant and animal genomes. Beyond improvements to contiguity, we will focus on the additional biological insights that can be made with better assemblies, including more complete analysis genes in their flanking regulatory context, in-depth studies of transposable elements and other complex gene families, and long-range synteny analysis of entire chromosomes. We will also discuss the need for new algorithms for representing and analyzing collections of many complete genomes at once.


June 1, 2021  |  

Long-read assembly of the Aedes aegypti Aag2 cell line genome resolves ancient endogenous viral elements

Transmission of arboviruses such as Dengue Virus by Aedes aegypti causes debilitating disease across the globe. Disease in humans can include severe acute symptoms such as hemorrhagic fever and organ failure, but mosquitoes tolerate high titers of virus in a persistent infection. The mechanisms responsible for this viral tolerance are unclear. Recent publications highlighted the integration of genetic material from non-retroviral RNA viruses into the genome of the host during infection that relies upon endogenous retro-transcriptase activity from transposons. These endogenous viral elements (EVEs) found in the genome are predicted to be ancient, and at least some EVEs are under purifying selection, suggesting they are beneficial to the host. To characterize EVE biogenesis in a tractable system, we sequenced the Ae. aegypti cell line, Aag2, to 58-fold coverage and present a de novo assembly of the genome. The assembly contains 1.7 Gb of genomic and 255 Mb of alternative haplotype specific sequence, consisting of contigs with a N50 of 1.4 Mb; a value that, when compared with other assemblies of the Aedes genus, is from 1-3 orders of magnitude longer. The Aag2 genome is highly repetitive (70%), most of which is classified as transposable elements (60%). We identify EVEs in the genome homologous to a range of extant viruses, many of which cluster in these regions of repetitive DNA. The contiguous assembly allows for more comprehensive identification of the transposable elements and EVEs that are most likely to be lost in assemblies lacking the read length of SMRT Sequencing.


June 1, 2021  |  

Long-read assembly of the Aedes aegypti Aag2 cell line genome resolves ancient endogenous viral elements

Transmission of arboviruses such as Dengue and Zika viruses by Aedes aegypti causes widespread and debilitating disease across the globe. Disease in humans can include severe acute symptoms such as hemorrhagic fever, organ failure, and encephalitis; and yet, mosquitoes tolerate high titers of virus in a persistent infection. The mechanisms responsible for tolerance to viral infection in mosquitoes are still unclear. Recent publications have highlighted the integration of genetic material from non-retroviral RNA viruses into the genome of the host during infection that relies upon endogenous retro-transcriptase activity from transposons. These endogenous viral elements (EVEs) found in the genome are predicted to be ancient and at least some EVEs are under purifying selection, which suggests that they are beneficial to the host. In order characterize EVE biogenesis in a tractable system we sequenced the Ae. aegypti cell line, Aag2, to 58X coverage and here present a de novo assembly of the genome. The assembly consists of 1.7 Gb of genomic and 255 Mb of alternative haplotype specific sequence, made up of contigs with a N50 of 1.4 Mb; a value that, when compared with other assemblies of the Aedes genus, is from 1-3 orders of magnitude longer. The Aag2 genome is highly repetitive (70%), most of which is classified as transposable elements (60%). We identify a plethora of EVEs in the genome homologous to a diverse range of extant viruses, many of which cluster in these regions of highly repetitive DNA. The highly contiguous nature of this assembly allows for a more comprehensive identification of the transposable elements and EVEs that are most likely to be lost in assemblies lacking the read length of SMRT Sequencing. Transmission of arboviruses such as Dengue Virus by Aedes aegypti causes widespread and debilitating disease across the globe. Disease in humans can include severe acute symptoms such as hemorrhagic fever, organ failure, and encephalitis; and yet, mosquitoes tolerate high titers of virus in a persistent infection. The mechanisms responsible for tolerance to viral infection in mosquitoes are still unclear. Recent publications have highlighted the integration of genetic material from non-retroviral RNA viruses into the genome of the host during infection that relies upon endogenous retro-transcriptase activity from transposons. These endogenous viral elements (EVEs) found in the genome are predicted to be ancient and at least some EVEs are under purifying selection, which suggests that they are beneficial to the host. In order characterize EVE biogenesis in a tractable system we sequenced the Ae. aegypti cell line, Aag2, to 58X coverage and here present a de novo assembly of the genome. The assembly consists of 1.7 Gb of genomic and 255 Mb of alternative haplotype specific sequence, made up of contigs with a N50 of 1.4 Mb; a value that, when compared with other assemblies of the Aedes genus, is from 1-3 orders of magnitude longer. The Aag2 genome is highly repetitive (70%), most of which is classified as transposable elements (60%). We identify a plethora of EVEs in the genome homologous to a diverse range of extant viruses, many of which cluster in these regions of highly repetitive DNA. The highly contiguous nature of this assembly allows for a more comprehensive identification of the transposable elements and EVEs that are most likely to be lost in assemblies lacking the read length of SMRT Sequencing. Transmission of arboviruses such as Dengue Virus by Aedes aegypti causes widespread and debilitating disease across the globe. Disease in humans can include severe acute symptoms such as hemorrhagic fever, organ failure, and encephalitis; and yet, mosquitoes tolerate high titers of virus in a persistent infection. The mechanisms responsible for tolerance to viral infection in mosquitoes are still unclear.


April 21, 2020  |  

Tracking short-term changes in the genetic diversity and antimicrobial resistance of OXA-232-producing Klebsiella pneumoniae ST14 in clinical settings.

To track stepwise changes in genetic diversity and antimicrobial resistance in rapidly evolving OXA-232-producing Klebsiella pneumoniae ST14, an emerging carbapenem-resistant high-risk clone, in clinical settings.Twenty-six K. pneumoniae ST14 isolates were collected by the Korean Nationwide Surveillance of Antimicrobial Resistance system over the course of 1 year. Isolates were subjected to whole-genome sequencing and MIC determinations using 33 antibiotics from 14 classes.Single-nucleotide polymorphism (SNP) typing identified 72 unique SNP sites spanning the chromosomes of the isolates, dividing them into three clusters (I, II and III). The initial isolate possessed two plasmids with 18 antibiotic-resistance genes, including blaOXA-232, and exhibited resistance to 11 antibiotic classes. Four other plasmids containing 12 different resistance genes, including blaCTX-M-15 and strA/B, were introduced over time, providing additional resistance to aztreonam and streptomycin. Moreover, chromosomal integration of insertion sequence Ecp1-blaCTX-M-15 mediated the inactivation of mgrB responsible for colistin resistance in four isolates from cluster III. To the best of our knowledge, this is the first description of K. pneumoniae ST14 resistant to both carbapenem and colistin in South Korea. Furthermore, although some acquired genes were lost over time, the retention of 12 resistance genes and inactivation of mgrB provided resistance to 13 classes of antibiotics.We describe stepwise changes in OXA-232-producing K. pneumoniae ST14 in vivo over time in terms of antimicrobial resistance. Our findings contribute to our understanding of the evolution of emerging high-risk K. pneumoniae clones and provide reference data for future outbreaks.Copyright © 2019 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.


April 21, 2020  |  

Evolution of a 72-kb cointegrant, conjugative multiresistance plasmid from early community-associated methicillin-resistant Staphylococcus aureus isolates.

Horizontal transfer of plasmids encoding antimicrobial-resistance and virulence determinants has been instrumental in Staphylococcus aureus evolution, including the emergence of community-associated methicillin-resistant S. aureus (CA-MRSA). In the early 1990s the first CA-MRSA isolated in Western Australia (WA), WA-5, encoded cadmium, tetracycline and penicillin-resistance genes on plasmid pWBG753 (~30 kb). WA-5 and pWBG753 appeared only briefly in WA, however, fusidic-acid-resistance plasmids related to pWBG753 were also present in the first European CA-MRSA at the time. Here we characterized a 72-kb conjugative plasmid pWBG731 present in multiresistant WA-5-like clones from the same period. pWBG731 was a cointegrant formed from pWBG753 and a pWBG749-family conjugative plasmid. pWBG731 carried mupirocin, trimethoprim, cadmium and penicillin-resistance genes. The stepwise evolution of pWBG731 likely occurred through the combined actions of IS257, IS257-dependent miniature inverted-repeat transposable elements (MITEs) and the BinL resolution system of the ß-lactamase transposon Tn552 An evolutionary intermediate ~42-kb non-conjugative plasmid pWBG715, possessed the same resistance genes as pWBG731 but retained an integrated copy of the small tetracycline-resistance plasmid pT181. IS257 likely facilitated replacement of pT181 with conjugation genes on pWBG731, thus enabling autonomous transfer. Like conjugative plasmid pWBG749, pWBG731 also mobilized non-conjugative plasmids carrying oriT mimics. It seems likely that pWBG731 represents the product of multiple recombination events between the WA-5 pWBG753 plasmid and other mobile genetic elements present in indigenous CA-MSSA. The molecular evolution of pWBG731 saliently illustrates how diverse mobile genetic elements can together facilitate rapid accrual and horizontal dissemination of multiresistance in S. aureus CA-MRSA.Copyright © 2019 American Society for Microbiology.


April 21, 2020  |  

The Chinese chestnut genome: a reference for species restoration

Forest tree species are increasingly subject to severe mortalities from exotic pests, diseases, and invasive organisms, accelerated by climate change. Forest health issues are threatening multiple species and ecosystem sustainability globally. While sources of resistance may be available in related species, or among surviving trees, introgression of resistance genes into threatened tree species in reasonable time frames requires genome-wide breeding tools. Asian species of chestnut (Castanea spp.) are being employed as donors of disease resistance genes to restore native chestnut species in North America and Europe. To aid in the restoration of threatened chestnut species, we present the assembly of a reference genome with chromosome-scale sequences for Chinese chestnut (C. mollissima), the disease-resistance donor for American chestnut restoration. We also demonstrate the value of the genome as a platform for research and species restoration, including new insights into the evolution of blight resistance in Asian chestnut species, the locations in the genome of ecologically important signatures of selection differentiating American chestnut from Chinese chestnut, the identification of candidate genes for disease resistance, and preliminary comparisons of genome organization with related species.


April 21, 2020  |  

Characterization of LINE-1 transposons in a human genome at allelic resolution

The activity of the retrotransposon LINE-1 has created a substantial portion of the human genome. Most of this sequence comprises fractured and debilitated LINE-1s. An accurate approximation of the number, location, and sequence of the LINE-1 elements present in any single genome has proven elusive due to the difficulty of assembling and phasing the repetitive and polymorphic regions of the human genome. Through an in-depth analysis of publicly-available, deep, long-read assemblies of nearly homozygous human genomes, we defined the location and sequence of all intact LINE-1s in these assemblies. We found 148 and 142 intact LINE-1s in two nearly homozygous assemblies. A combination of these assemblies suggests a diploid human genome contains at least 50% more intact LINE-1s than previous estimates textendash in this case, 290 intact LINE-1s at 194 loci. We think this is the best approximation, to date, of the number of intact LINE-1s in a single diploid human genome. In addition to counting intact LINE-1 elements, we resolved the sequence of each element, including some LINE-1 elements in unassembled, presumably centromeric regions of the genome. A comparison of the intact LINE-1s in each assembly shows the specific pattern of variation between these genomes, including LINE-1s that remain intact in only one genome, allelic variation in shared intact LINE-1s, and LINE-1s that are unique (presumably young) insertions in only one genome. We found that many old elements (> 6 million years old) remain intact, and comparison of the young and intact LINE-1s across assemblies reinforces the notion that only a small portion of all LINE-1 sequences that may be intact in the genomes of the human population has been uncovered. This dataset provides the first nearly comprehensive estimate of LINE-1 diversity within an individual, an important dataset in the quest to understand the functional consequences of sequence variation in LINE-1 and the complete set of LINE-1s in the human population.


April 21, 2020  |  

Benchmarking Transposable Element Annotation Methods for Creation of a Streamlined, Comprehensive Pipeline

Sequencing technology and assembly algorithms have matured to the point that high-quality de novo assembly is possible for large, repetitive genomes. Current assemblies traverse transposable elements (TEs) and allow for annotation of TEs. There are numerous methods for each class of elements with unknown relative performance metrics. We benchmarked existing programs based on a curated library of rice TEs. Using the most robust programs, we created a comprehensive pipeline called Extensive de-novo TE Annotator (EDTA) that produces a condensed TE library for annotations of structurally intact and fragmented elements. EDTA is open-source and freely available: https://github.com/oushujun/EDTA.List of abbreviationsTETransposable ElementsLTRLong Terminal RepeatLINELong Interspersed Nuclear ElementSINEShort Interspersed Nuclear ElementMITEMiniature Inverted Transposable ElementTIRTerminal Inverted RepeatTSDTarget Site DuplicationTPTrue PositivesFPFalse PositivesTNTrue NegativeFNFalse NegativesGRFGeneric Repeat FinderEDTAExtensive de-novo TE Annotator


April 21, 2020  |  

Genome data of Fusarium oxysporum f. sp. cubense race 1 and tropical race 4 isolates using long-read sequencing.

Fusarium wilt of banana is caused by the soil-borne fungal pathogen Fusarium oxysporum f. sp. cubense (Foc). We generated two chromosome-level assemblies of Foc race 1 and tropical race 4 strains using single-molecule real-time sequencing. The Foc1 and FocTR4 assemblies had 35 and 29 contigs with contig N50 lengths of 2.08 Mb and 4.28 Mb, respectively. These two new references genomes represent a greater than 100-fold improvement over the contig N50 statistics of the previous short read-based Foc assemblies. The two high-quality assemblies reported here will be a valuable resource for the comparative analysis of Foc races at the pathogenic levels.


April 21, 2020  |  

Insect genomes: progress and challenges.

In the wake of constant improvements in sequencing technologies, numerous insect genomes have been sequenced. Currently, 1219 insect genome-sequencing projects have been registered with the National Center for Biotechnology Information, including 401 that have genome assemblies and 155 with an official gene set of annotated protein-coding genes. Comparative genomics analysis showed that the expansion or contraction of gene families was associated with well-studied physiological traits such as immune system, metabolic detoxification, parasitism and polyphagy in insects. Here, we summarize the progress of insect genome sequencing, with an emphasis on how this impacts research on pest control. We begin with a brief introduction to the basic concepts of genome assembly, annotation and metrics for evaluating the quality of draft assemblies. We then provide an overview of genome information for numerous insect species, highlighting examples from prominent model organisms, agricultural pests and disease vectors. We also introduce the major insect genome databases. The increasing availability of insect genomic resources is beneficial for developing alternative pest control methods. However, many opportunities remain for developing data-mining tools that make maximal use of the available insect genome resources. Although rapid progress has been achieved, many challenges remain in the field of insect genomics. © 2019 The Royal Entomological Society.


April 21, 2020  |  

Full-length mRNA sequencing and gene expression profiling reveal broad involvement of natural antisense transcript gene pairs in pepper development and response to stresses.

Pepper is an important vegetable with great economic value and unique biological features. In the past few years, significant development has been made towards understanding the huge complex pepper genome; however, pepper functional genomics has not been well studied. To better understand the pepper gene structure and pepper gene regulation, we conducted full-length mRNA sequencing by PacBio sequencing and obtained 57862 high-quality full-length mRNA sequences derived from 18362 previously annotated and 5769 newly detected genes. New gene models were built that combined the full-length mRNA sequences and corrected approximately 500 fragmented gene models from previous annotations. Based on the full-length mRNA, we identified 4114 and 5880 pepper genes forming natural antisense transcript (NAT) genes in-cis and in-trans, respectively. Most of these genes accumulate small RNAs in their overlapping regions. By analyzing these NAT gene expression patterns in our transcriptome data, we identified many NAT pairs responsive to a variety of biological processes in pepper. Pepper formate dehydrogenase 1 (FDH1), which is required for R-gene-mediated disease resistance, may be regulated by nat-siRNAs and participate in a positive feedback loop in salicylic acid biosynthesis during resistance responses. Several cis-NAT pairs and subgroups of trans-NAT genes were responsive to pepper pericarp and placenta development, which may play roles in capsanthin and capsaicin biosynthesis. Using a comparative genomics approach, the evolutionary mechanisms of cis-NATs were investigated, and we found that an increase in intergenic sequences accounted for the loss of most cis-NATs, while transposon insertion contributed to the formation of most new cis-NATs. This article is protected by copyright. All rights reserved.This article is protected by copyright. All rights reserved.


April 21, 2020  |  

Pandemic spread of blaKPC-2 among Klebsiella pneumoniae ST11 in China is associated with horizontal transfer mediated by IncFII-like plasmids.

This study aimed to investigate the spread of the blaKPC-2 gene among Klebsiella pneumoniae and to illustrate the mechanism of dissemination of KPC-producing K. pneumoniae (KPC-Kp) ST11 in China.A total of 354 K. pneumoniae isolates were collected from four hospitals in China and were characterized by Multilocus sequence typing (MLST). Mobile genetic elements (MGEs) and pulsed-field gel electrophoresis (PFGE) analysis were used to identify the subtypes of K. pneumoniae ST11. PCR-based amplification and sequencing were performed to analyze Tn1721 transposons and IncFII-like plasmids. Electroporation experiments and whole-genome sequencing (WGS) analysis were used to reveal the genetic environment of the blaKPC-2 gene.As the primary type(87.1%) of KPC-Kp, K. pneumoniae ST11 was not predominant in nonKPC-Kp(3.1%). ST11 KPC-Kp was clonally heterogeneous and could be further classified into eleven MGE types and fourteen PFGE subtypes. Five Tn1721-blaKPC-2 variants were identified on IncFII-like plasmids. The detection rate of IncFII-like plasmids was much higher in ST11 KPC-Kp (100%) compared with non-ST11 KPC-Kp (16.0%) and the nonKPC-Kp group (7.5%). Moreover, the IncFII plasmid (with IIa replicon) was primarily detected on the MGE-F type (61.7%). The IncFIIk plasmid (with IIk replicon) was clustered into two subtypes: MGE-A (28.3%) and -F (41.5%). The detection of the IncFII and IncFIIk plasmids on MGE-A was 57.1% (20/35) and 42.9% (15/35), respectively.We revealed a close correlation between ST11 KPC-Kp and IncFII-like plasmids. Horizontal transfer mediated by IncFII-like plasmids plays an important role in the pandemic expansion of blaKPC-2 among K. pneumoniae ST11 in China. Copyright © 2019. Published by Elsevier B.V.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.