July 19, 2019  |  

Complete genome sequence of Vibrio campbellii strain 20130629003S01 isolated from shrimp with acute hepatopancreatic necrosis disease.

Vibrio campbellii is widely distributed in the marine environment and is an important pathogen of aquatic organisms such as shrimp, fish, and mollusks. An isolate of V. campbellii carrying the pirAB(vp) gene, causing acute hepatopancreatic necrosis disease (AHPND), has been reported. There are no previous reports about the complete genome of V. campbellii causing AHPND (VCAHPND). To extend our understanding of the pathogenesis of VCAHPND at the genomic level, the genome of V. campbellii 20130629003S01 isolated from a shrimp with AHPND was sequenced and analysed. The complete genome sequence of V. campbellii 20130629003S01 was generated using the PacBio RSII platform with single molecule, real-time sequencing. The 20130629003S01 strain consists of two circular chromosomes (3,621,712 bp in chromosome 1 and 2,245,751 bp in chromosome 2) and four plasmids of 70,066, 204,531, 143,140, and 86,121 bp. The genome contains a total of 5855 protein coding genes, 134 tRNA genes and 37 rRNA genes. The average nucleotide identity value of 20130629003S01 and other reference V. campbellii strains was 97.46%, suggesting that they are closely related. The genome sequence of V. campbellii 20130629003S01 and its comparative analysis with other V. campbellii strains that we present here are important for a better understanding of the genomic characteristics of VCAHPND.
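As a small illustration of the replicon sizes reported above, the following sketch adds up the two chromosomes and four plasmids to give a total genome size. The total is derived here by simple addition and is not a figure stated in the abstract; the dictionary keys (including the plasmid labels) are hypothetical names chosen for illustration only.

```python
# Replicon lengths in bp, as listed in the abstract.
# The plasmid labels are hypothetical; the abstract gives only their sizes.
replicons = {
    "chromosome_1": 3_621_712,
    "chromosome_2": 2_245_751,
    "plasmid_a": 70_066,
    "plasmid_b": 204_531,
    "plasmid_c": 143_140,
    "plasmid_d": 86_121,
}


def total_genome_size(sizes):
    """Return the summed length (bp) of all replicons."""
    return sum(sizes.values())


def plasmid_fraction(sizes):
    """Return the fraction of the genome carried on plasmids."""
    plasmid_bp = sum(v for k, v in sizes.items() if k.startswith("plasmid"))
    return plasmid_bp / total_genome_size(sizes)


if __name__ == "__main__":
    print(f"Total genome size: {total_genome_size(replicons):,} bp")
    print(f"Plasmid fraction: {plasmid_fraction(replicons):.1%}")
```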


July 19, 2019  |  

Improved maize reference genome with single-molecule technologies.

Complete and accurate reference genomes and annotations provide fundamental tools for characterization of genetic and functional variation. These resources facilitate the determination of biological processes and support translation of research findings into improved and sustainable agricultural technologies. Many reference genomes for crop plants have been generated over the past decade, but these genomes are often fragmented and missing complex repeat regions. Here we report the assembly and annotation of a reference genome of maize, a genetic and agricultural model species, using single-molecule real-time sequencing and high-resolution optical mapping. Relative to the previous reference genome, our assembly features a 52-fold increase in contig length and notable improvements in the assembly of intergenic spaces and centromeres. Characterization of the repetitive portion of the genome revealed more than 130,000 intact transposable elements, allowing us to identify transposable element lineage expansions that are unique to maize. Gene annotations were updated using 111,000 full-length transcripts obtained by single-molecule real-time sequencing. In addition, comparative optical mapping of two other inbred maize lines revealed a prevalence of deletions in regions of low gene density and maize lineage-specific genes.


July 19, 2019  |  

Quality control of the traditional patent medicine Yimu Wan based on SMRT Sequencing and DNA barcoding.

Substandard traditional patent medicines may lead to global safety-related issues. Protecting consumers from the health risks associated with the integrity and authenticity of herbal preparations is of great concern. Of particular concern is quality control for traditional patent medicines. Here, we establish an effective approach for verifying the biological composition of traditional patent medicines based on single-molecule real-time (SMRT) sequencing and DNA barcoding. Yimu Wan (YMW), a classical herbal prescription recorded in the Chinese Pharmacopoeia, was chosen to test the method. Two reference YMW samples were used to establish a standard method for analysis, which was then applied to three different batches of commercial YMW samples. A total of 3703 and 4810 circular-consensus sequencing (CCS) reads from two reference and three commercial YMW samples were mapped to the ITS2 and psbA-trnH regions, respectively. Moreover, comparison of intraspecific genetic distances based on SMRT sequencing data with reference data from Sanger sequencing revealed an ITS2 and psbA-trnH intergenic spacer that exhibited high intraspecific divergence, with the sites of variation showing significant differences within species. Using the CCS strategy for SMRT sequencing analysis was adequate to guarantee the accuracy of identification. This study demonstrates the application of SMRT sequencing to detect the biological ingredients of herbal preparations. SMRT sequencing provides an affordable way to monitor the legality and safety of traditional patent medicines.


July 19, 2019  |  

Reduction in chromosome mobility accompanies nuclear organization during early embryogenesis in Caenorhabditis elegans.

In differentiated cells, chromosomes are packed inside the cell nucleus in an organised fashion. In contrast, little is known about how chromosomes are packed in undifferentiated cells and how nuclear organization changes during development. To assess changes in nuclear organization during the earliest stages of development, we quantified the mobility of a pair of homologous chromosomal loci in the interphase nuclei of Caenorhabditis elegans embryos. The distribution of distances between homologous loci was consistent with a random distribution up to the 8-cell stage but not at later stages. The mobility of the loci was significantly reduced from the 2-cell to the 48-cell stage. Nuclear foci corresponding to epigenetic marks as well as heterochromatin and the nucleolus also appeared around the 8-cell stage. We propose that the earliest global transformation in nuclear organization occurs at the 8-cell stage during C. elegans embryogenesis.


July 19, 2019  |  

Characterization of a large antibiotic resistance plasmid found in enteropathogenic Escherichia coli strain B171 and its relatedness to plasmids of diverse E. coli and Shigella.

Enteropathogenic Escherichia coli (EPEC) is a leading cause of severe infantile diarrhea in developing countries. Previous research has focused on the diversity of the EPEC virulence plasmid, whereas less is known regarding the genetic content and distribution of antibiotic resistance plasmids carried by EPEC. A previous study demonstrated that in addition to the virulence plasmid, reference EPEC strain B171 harbors a second, larger plasmid that confers antibiotic resistance. To further understand the genetic diversity and dissemination of antibiotic resistance plasmids among EPEC strains, we describe the complete sequence of an antibiotic resistance plasmid from EPEC strain B171. The resistance plasmid, pB171_90, has a completed sequence length of 90,229 bp, a GC content of 54.55%, and carries protein-encoding genes involved in conjugative transfer, resistance to tetracycline (tetA), sulfonamides (sulI), and mercury, as well as several virulence-associated genes, including the transcriptional regulator hha and the putative calcium sequestration inhibitor (csi). In silico detection of the pB171_90 genes among 4,798 publicly available E. coli genome assemblies indicates that the unique genes of pB171_90 (csi and traI) are primarily restricted to genomes identified as EPEC or enterotoxigenic E. coli. However, conserved regions of the pB171_90 plasmid containing genes involved in replication, stability, and antibiotic resistance were identified among diverse E. coli pathotypes. Interestingly, pB171_90 also exhibited significant similarity with a sequenced plasmid from Shigella dysenteriae type I. Our findings demonstrate the mosaic nature of EPEC antibiotic resistance plasmids and highlight the need for additional sequence-based characterization of antibiotic resistance plasmids harbored by pathogenic E. coli. Copyright © 2017 American Society for Microbiology.


July 19, 2019  |  

The complete genome sequence of the phytopathogenic fungus Sclerotinia sclerotiorum reveals insights into the genome architecture of broad host range pathogens.

Sclerotinia sclerotiorum is a phytopathogenic fungus with over 400 hosts including numerous economically important cultivated species. This contrasts with many economically destructive pathogens that only exhibit a single or very few hosts. Many plant pathogens exhibit a “two-speed” genome, so described because their genomes contain alternating gene-rich, repeat-sparse and gene-poor, repeat-rich regions. In fungi, the repeat-rich regions may be subjected to a process termed repeat-induced point mutation (RIP). Both repeat activity and RIP are thought to play a significant role in evolution of secreted virulence proteins, termed effectors. We present a complete genome sequence of S. sclerotiorum generated using Single Molecule Real-Time Sequencing technology with highly accurate annotations produced using an extensive RNA sequencing data set. We identified 70 effector candidates and have highlighted their in planta expression profiles. Furthermore, we characterized the genome architecture of S. sclerotiorum in comparison to plant pathogens that exhibit “two-speed” genomes. We show that there is a significant association between positions of secreted proteins and regions with a high RIP index in S. sclerotiorum but we did not detect a correlation between secreted protein proportion and GC content. Neither did we detect a negative correlation between CDS content and secreted protein proportion across the S. sclerotiorum genome. We conclude that S. sclerotiorum exhibits subtle signatures of enhanced mutation of secreted proteins in specific genomic compartments as a result of transposition and RIP activity. However, these signatures are not observable at the whole-genome scale.


July 19, 2019  |  

How Single Molecule Real-Time Sequencing and haplotype phasing have enabled reference-grade diploid genome assembly of wine grapes.

Domesticated grapevines (Vitis vinifera) have relatively small genomes of about 500 Mb (Lodhi and Reisch, 1995; Jaillon et al., 2007; Velasco et al., 2007), which is similar to other small-genome species like rice (430 Mb; Goff et al., 2002), medicago (500 Mb; Tang et al., 2014), and poplar (465 Mb; Tuskan et al., 2006). Despite their small genome size, the sequencing and assembling of grapevine genomes is difficult because of high levels of heterozygosity. The high heterozygosity in domesticated grapes may be due, in part, to their domestication from an obligately outcrossing, dioecious wild progenitor. Domesticated grapes can be selfed, in theory, because their mating system transitioned to hermaphroditic, self-fertile flowers during domestication. In practice, however, selfed progeny tend to be non-viable, presumably due to a high deleterious recessive load and resulting inbreeding depression. As a consequence of these fitness effects, most grape cultivars are crosses between distantly related parents (Strefeler et al., 1992; Ohmi et al., 1993; Bowers and Meredith, 1997; Sefc et al., 1998; Lopes et al., 1999; Di Gaspero et al., 2005; Tapia et al., 2007; Ibáñez et al., 2009; Cipriani et al., 2010; Myles et al., 2011; Lacombe et al., 2013).


July 19, 2019  |  

First report of two complete Clostridium chauvoei genome sequences and detailed in silico genome analysis.

Clostridium (C.) chauvoei is a Gram-positive, spore-forming, anaerobic bacterium. It causes black leg in ruminants, a typically fatal histotoxic myonecrosis. High quality circular genome sequences were generated for the C. chauvoei type strain DSM 7528(T) (ATCC 10092(T)) and a field strain 12S0467 isolated in Germany. The origin of replication (oriC) was comparable to that of Bacillus subtilis in structure with two regions containing DnaA boxes. Similar prophages were identified in the genomes of both C. chauvoei strains which also harbored hemolysin and bacterial spore formation genes. A CRISPR type I-B system with limited variations in the repeat number was identified. Sporulation and germination process related genes were homologous to those of the Clostridia cluster I group but novel variations for regulatory genes were identified, indicative of strain specific control of regulatory events. Phylogenomics showed a higher relatedness to C. septicum than to other so far sequenced genomes of species belonging to the genus Clostridium. Comparative genome analysis of three C. chauvoei circular genome sequences revealed the presence of few inversions and translocations in locally collinear blocks (LCBs). The species genome also shows a large number of genes involved in proteolysis, genes for glycosyl hydrolases and metal iron transportation genes which are presumably involved in virulence and survival in the host. Three conserved flagellar genes (fliC) were identified in each of the circular genomes. In conclusion, this is the first comparative analysis of circular genomes for the species C. chauvoei, enabling insights into genome composition and virulence factor variation. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.


July 19, 2019  |  

A case study into microbial genome assembly gap sequences and finishing strategies.

This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly, which included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.


July 19, 2019  |  

The draft genome of tropical fruit durian (Durio zibethinus).

Durian (Durio zibethinus) is a Southeast Asian tropical plant known for its hefty, spine-covered fruit and sulfury and onion-like odor. Here we present a draft genome assembly of D. zibethinus, representing the third plant genus in the Malvales order and first in the Helicteroideae subfamily to be sequenced. Single-molecule sequencing and chromosome contact maps enabled assembly of the highly heterozygous durian genome at chromosome-scale resolution. Transcriptomic analysis showed upregulation of sulfur-, ethylene-, and lipid-related pathways in durian fruits. We observed paleopolyploidization events shared by durian and cotton and durian-specific gene expansions in MGL (methionine γ-lyase), associated with production of volatile sulfur compounds (VSCs). MGL and the ethylene-related gene ACS (aminocyclopropane-1-carboxylic acid synthase) were upregulated in fruits concomitantly with their downstream metabolites (VSCs and ethylene), suggesting a potential association between ethylene biosynthesis and methionine regeneration via the Yang cycle. The durian genome provides a resource for tropical fruit biology and agronomy.


July 19, 2019  |  

De novo PacBio long-read and phased avian genome assemblies correct and add to reference genes generated with intermediate and short reads.

Reference-quality genomes are expected to provide a resource for studying gene structure, function, and evolution. However, often genes of interest are not completely or accurately assembled, leading to unknown errors in analyses or additional cloning efforts for the correct sequences. A promising solution is long-read sequencing. Here we tested PacBio-based long-read sequencing and diploid assembly for potential improvements to the Sanger-based intermediate-read zebra finch reference and Illumina-based short-read Anna’s hummingbird reference, 2 vocal learning avian species widely studied in neuroscience and genomics. With DNA of the same individuals used to generate the reference genomes, we generated diploid assemblies with the FALCON-Unzip assembler, resulting in contigs with no gaps in the megabase range, representing 150-fold and 200-fold improvements over the current zebra finch and hummingbird references, respectively. These long-read and phased assemblies corrected and resolved what we discovered to be numerous misassemblies in the references, including missing sequences in gaps, erroneous sequences flanking gaps, base call errors in difficult-to-sequence regions, complex repeat structure errors, and allelic differences between the 2 haplotypes. These improvements were validated by single long-genome and transcriptome reads and resulted for the first time in completely resolved protein-coding genes widely studied in neuroscience and specialized in vocal learning species. These findings demonstrate the impact of long reads, sequencing of previously difficult-to-sequence regions, and phasing of haplotypes on generating the high-quality assemblies necessary for understanding gene structure, function, and evolution. © The Authors 2017. Published by Oxford University Press.


July 19, 2019  |  

The diversity, structure, and function of heritable adaptive immunity sequences in the Aedes aegypti genome.

The Aedes aegypti mosquito transmits arboviruses, including dengue, chikungunya, and Zika virus. Understanding the mechanisms underlying mosquito immunity could provide new tools to control arbovirus spread. Insects exploit two different RNAi pathways to combat viral and transposon infection: short interfering RNAs (siRNAs) and PIWI-interacting RNAs (piRNAs) [1, 2]. Endogenous viral elements (EVEs) are sequences from non-retroviral viruses that are inserted into the mosquito genome and can act as templates for the production of piRNAs [3, 4]. EVEs therefore represent a record of past infections and a reservoir of potential immune memory [5]. The large-scale organization of EVEs has been difficult to resolve with short-read sequencing because they tend to integrate into repetitive regions of the genome. To define the diversity, organization, and function of EVEs, we took advantage of the contiguity associated with long-read sequencing to generate a high-quality assembly of the Ae. aegypti-derived Aag2 cell line genome, an important and widely used model system. We show EVEs are acquired through recombination with specific classes of long terminal repeat (LTR) retrotransposons and organize into large loci (>50 kbp) characterized by high LTR density. These EVE-containing loci have increased density of piRNAs compared to similar regions without EVEs. Furthermore, we detected EVE-derived piRNAs consistent with a targeted processing of persistently infecting virus genomes. We propose that comparisons of EVEs across mosquito populations may explain differences in vector competence, and further study of the structure and function of these elements in the genome of mosquitoes may lead to epidemiological interventions. Copyright © 2017 Elsevier Ltd. All rights reserved.


July 19, 2019  |  

The first near-complete assembly of the hexaploid bread wheat genome, Triticum aestivum.

Common bread wheat, Triticum aestivum, has one of the most complex genomes known to science, with 6 copies of each chromosome, enormous numbers of near-identical sequences scattered throughout, and an overall haploid size of more than 15 billion bases. Multiple past attempts to assemble the genome have produced assemblies that were well short of the estimated genome size. Here we report the first near-complete assembly of T. aestivum, using deep sequencing coverage from a combination of short Illumina reads and very long Pacific Biosciences reads. The final assembly contains 15 344 693 583 bases and has a weighted average (N50) contig size of 232 659 bases. This represents by far the most complete and contiguous assembly of the wheat genome to date, providing a strong foundation for future genetic studies of this important food crop. We also report how we used the recently published genome of Aegilops tauschii, the diploid ancestor of the wheat D genome, to identify 4 179 762 575 bp of T. aestivum that correspond to its D genome components. © The Author 2017. Published by Oxford University Press.
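The N50 contig size quoted above can be illustrated with a short sketch. It uses the common definition of N50 (the contig length at which contigs of that length or longer together cover at least half of the assembly), which is an assumption here since the abstract does not spell out its exact computation; the function name and the toy contig lengths are hypothetical and are not data from the wheat assembly.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bases).

    Assumes the common definition: the length L such that contigs of
    length >= L together contain at least half of all assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Toy example only; these are not contig lengths from any real assembly.
    toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(n50(toy_contigs))
```

Because the statistic weights each contig by its own length, a handful of long contigs can dominate it, which is why it is usually reported alongside the total assembly size, as in the abstract.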


July 19, 2019  |  

Long-read genome sequence assembly provides insight into ongoing retroviral invasion of the koala germline.

The koala retrovirus (KoRV) is implicated in several diseases affecting the koala (Phascolarctos cinereus). KoRV provirus can be present in the genome of koalas as an endogenous retrovirus (present in all cells via germline integration) or as exogenous retrovirus responsible for somatic integrations of proviral KoRV (present in a limited number of cells). This ongoing invasion of the koala germline by KoRV provides a powerful opportunity to assess the viral strategies used by KoRV in an individual. Analysis of a high-quality genome sequence of a single koala revealed 133 KoRV integration sites. Most integrations contain full-length, endogenous provirus; KoRV-A subtype. The second most frequent integrations contain an endogenous recombinant element (recKoRV) in which most of the KoRV protein-coding region has been replaced with an ancient, endogenous retroelement. A third set of integrations, with very low sequence coverage, may represent somatic cell integrations of KoRV-A, KoRV-B and two recently designated additional subgroups, KoRV-D and KoRV-E. KoRV-D and KoRV-E are missing several genes required for viral processing, suggesting they have been transmitted as defective viruses. Our results represent the first comprehensive analyses of KoRV integration and variation in a single animal and provide further insights into the process of retroviral-host species interactions.


July 19, 2019  |  

Centromere evolution and CpG methylation during vertebrate speciation.

Centromeres and large-scale structural variants evolve and contribute to genome diversity during vertebrate speciation. Here, we perform de novo long-read genome assembly of three inbred medaka strains that are derived from geographically isolated subpopulations and undergo speciation. Using single-molecule real-time (SMRT) sequencing, we obtain three chromosome-mapped genomes of length ~734, ~678, and ~744 Mbp with a resource of twenty-two centromeric regions of length 20-345 kbp. Centromeres are positionally conserved among the three strains and even between four pairs of chromosomes that were duplicated by the teleost-specific whole-genome duplication 320-350 million years ago. The centromeres do not all evolve at a similar pace; rather, centromeric monomers in non-acrocentric chromosomes evolve significantly faster than those in acrocentric chromosomes. Using methylation-sensitive SMRT reads, we uncover that centromeres are mostly hypermethylated but have hypomethylated sub-regions that acquire unique sequence compositions independently. These findings reveal the potential of non-acrocentric centromere evolution to contribute to speciation.

