July 7, 2019

Genomic innovation for crop improvement.

Crop production needs to increase to secure future food supplies, while reducing its impact on ecosystems. Detailed characterization of plant genomes and genetic diversity is crucial for meeting these challenges. Advances in genome sequencing and assembly are being used to access the large and complex genomes of crops and their wild relatives. These have helped to identify a wide spectrum of genetic variation and permitted the association of genetic diversity with diverse agronomic phenotypes. In combination with improved and automated phenotyping assays and functional genomic studies, genomics is providing new foundations for crop-breeding systems.


July 7, 2019

Genomic changes associated with the evolutionary transition of an insect gut symbiont into a blood-borne pathogen.

The genus Bartonella comprises facultative intracellular bacteria with a unique lifestyle. After transmission by blood-sucking arthropods they colonize the erythrocytes of mammalian hosts causing acute and chronic infectious diseases. Although the pathogen-host interaction is well understood, little is known about the evolutionary origin of the infection strategy manifested by Bartonella species. Here we analyzed six genomes of Bartonella apis, a honey bee gut symbiont that to date represents the closest relative of pathogenic Bartonella species. Comparative genomics revealed that B. apis encodes a large set of vertically inherited genes for amino acid and cofactor biosynthesis and nitrogen metabolism. Most pathogenic bartonellae have lost these ancestral functions, but acquired specific virulence factors and expanded a vertically inherited gene family for harvesting cofactors from the blood. However, the deeply rooted pathogen Bartonella tamiae has retained many of the ancestral genome characteristics reflecting an evolutionary intermediate state toward a host-restricted intraerythrocytic lifestyle. Our findings suggest that the ancestor of the pathogen Bartonella was a gut symbiont of insects and that the adaptation to blood-feeding insects facilitated colonization of the mammalian bloodstream. This study highlights the importance of comparative genomics among pathogens and non-pathogenic relatives to understand disease emergence within an evolutionary-ecological framework.


July 7, 2019

An improved genome assembly uncovers prolific tandem repeats in Atlantic cod.

The first Atlantic cod (Gadus morhua) genome assembly published in 2011 was one of the early genome assemblies exclusively based on high-throughput 454 pyrosequencing. Since then, rapid advances in sequencing technologies have led to a multitude of assemblies generated for complex genomes, although many of these are of a fragmented nature with a significant fraction of bases in gaps. The development of long-read sequencing and improved software now enables the generation of more contiguous genome assemblies. By combining data from Illumina, 454 and the longer PacBio sequencing technologies, as well as integrating the results of multiple assembly programs, we have created a substantially improved version of the Atlantic cod genome assembly. The sequence contiguity of this assembly is increased fifty-fold and the proportion of gap-bases has been reduced fifteen-fold. Compared to other vertebrates, the assembly contains an unusually high density of tandem repeats (TRs). Indeed, retrospective analyses reveal that gaps in the first genome assembly were largely associated with these TRs. We show that 21% of the TRs across the assembly, 19% in the promoter regions and 12% in the coding sequences are heterozygous in the sequenced individual. The inclusion of PacBio reads combined with the use of multiple assembly programs drastically improved the Atlantic cod genome assembly by successfully resolving long TRs. The high frequency of heterozygous TRs within or in the vicinity of genes in the genome indicates considerable standing genomic variation in Atlantic cod populations, which is likely of evolutionary importance.


July 7, 2019

AidP, a novel N-Acyl homoserine lactonase gene from Antarctic Planococcus sp.

Planococcus is a Gram-positive halotolerant bacterial genus in the phylum Firmicutes, commonly found in various habitats in Antarctica. Quorum quenching (QQ) is the disruption of bacterial cell-to-cell communication (known as quorum sensing), which has previously been described in mesophilic bacteria. This study demonstrated the QQ activity of a psychrotolerant strain, Planococcus versutus strain L10.15(T), isolated from a soil sample obtained near an elephant seal wallow in Antarctica. Whole genome analysis of this bacterial strain revealed the presence of an N-acyl homoserine lactonase, an enzyme that hydrolyzes the ester bond of the homoserine lactone ring of N-acyl homoserine lactones (AHLs). Heterologous gene expression in E. coli confirmed its function in hydrolyzing AHLs, and the gene was designated as aidP (autoinducer degrading gene from Planococcus sp.). The low temperature activity of this enzyme suggested that it is a novel and uncharacterized class of AHL lactonase. This study is the first report on QQ activity of bacteria isolated from the polar regions.


July 7, 2019

A spontaneous mutation in kdsD, a biosynthesis gene for 3-Deoxy-D-manno-Octulosonic Acid, occurred in a ciprofloxacin resistant strain of Francisella tularensis and caused a high level of attenuation in murine models of tularemia.

Francisella tularensis, a gram-negative facultative intracellular bacterial pathogen, is the causative agent of tularemia and is able to infect many mammalian species, including humans. Because of its ability to cause a lethal infection, low infectious dose, and aerosolizable nature, F. tularensis subspecies tularensis is considered a potential biowarfare agent. Due to its in vitro efficacy, ciprofloxacin is one of the antibiotics recommended for post-exposure prophylaxis of tularemia. In order to identify therapeutics that will be efficacious against infections caused by drug resistant select-agents and to better understand the threat, we sought to characterize an existing ciprofloxacin resistant (CipR) mutant in the Schu S4 strain of F. tularensis by determining its phenotypic characteristics and sequencing the chromosome to identify additional genetic alterations that may have occurred during the selection process. In addition to the previously described genetic alterations, the sequence of the CipR mutant strain revealed several additional mutations. Of particular interest was a frameshift mutation within kdsD, which encodes an enzyme necessary for the production of 3-Deoxy-D-manno-Octulosonic Acid (KDO), an integral component of the lipopolysaccharide (LPS). A kdsD mutant was constructed in the Schu S4 strain. Although it was not resistant to ciprofloxacin, the kdsD mutant shared many phenotypic characteristics with the CipR mutant, including growth defects under different conditions, sensitivity to hydrophobic agents, altered LPS profiles, and attenuation in multiple models of murine tularemia. This study demonstrates that the KdsD enzyme is essential for Francisella virulence and may be an attractive therapeutic target for developing novel medical countermeasures.


July 7, 2019

Combination of short-read, long-read and optical mapping assemblies reveals large-scale tandem repeat arrays with population genetic implications.

Accurate and contiguous genome assembly is key to a comprehensive understanding of the processes shaping genomic diversity and evolution. Yet, it is frequently constrained by constitutive heterochromatin, usually characterized by highly repetitive DNA. As a key feature of genome architecture associated with centromeric and telomeric regions, it influences meiotic recombination. In this study, we assess the impact of large tandem repeat arrays on the recombination rate landscape in an avian speciation model, the Eurasian crow. We assembled two high-quality genome references using single-molecule real-time sequencing (long-read assembly, LR) and single-molecule restriction maps (optical map assembly, OM). A three-way comparison including the published short-read assembly (SR) constructed for the same individual allowed assessing assembly properties and pinpointing mis-assemblies. Combining information from all three assemblies, we characterized 36 previously unidentified large repetitive regions in the proximity of sequence assembly breakpoints, the majority of which contained complex arrays of a 14-kb satellite repeat or its 1.2-kb subunit. Using genome-wide population re-sequencing data, we estimated the population-scaled recombination rate (ρ) and found it to be significantly reduced in these regions. These findings are consistent with an effect of low recombination in regions adjacent to centromeric or subtelomeric heterochromatin, and add to our understanding of the processes generating widespread heterogeneity in genetic diversity and differentiation along the genome. By combining three independent technologies, our results highlight the importance of adding a layer of information on genome structure inaccessible to each approach independently. Published by Cold Spring Harbor Laboratory Press.


July 7, 2019

Fungal volatile compounds induce production of the secondary metabolite Sodorifen in Serratia plymuthica PRI-2C.

The ability of bacteria and fungi to communicate with each other is a remarkable aspect of the microbial world. It is recognized that volatile organic compounds (VOCs) act as communication signals; however, the molecular responses by bacteria to fungal VOCs remain unknown. Here we perform transcriptomics and proteomics analyses of Serratia plymuthica PRI-2C exposed to VOCs emitted by the fungal pathogen Fusarium culmorum. We find that the bacterium responds to fungal VOCs with changes in gene and protein expression related to motility, signal transduction, energy metabolism, cell envelope biogenesis, and secondary metabolite production. Metabolomic analysis of the bacterium exposed to the fungal VOCs, gene cluster comparison, and heterologous co-expression of a terpene synthase and a methyltransferase revealed the production of the unusual terpene sodorifen in response to fungal VOCs. These results strongly suggest that VOCs are not only a metabolic waste but important compounds in the long-distance communication between fungi and bacteria.


July 7, 2019

Centromere location in Arabidopsis is unaltered by extreme divergence in CENH3 protein sequence.

During cell division, spindle fibers attach to chromosomes at centromeres. The DNA sequence at regional centromeres is fast evolving with no conserved genetic signature for centromere identity. Instead CENH3, a centromere-specific histone H3 variant, is the epigenetic signature that specifies centromere location across both plant and animal kingdoms. Paradoxically, CENH3 is also adaptively evolving. An ongoing question is whether CENH3 evolution is driven by a functional relationship with the underlying DNA sequence. Here, we demonstrate that despite extensive protein sequence divergence, CENH3 histones from distant species assemble centromeres on the same underlying DNA sequence. We first characterized the organization and diversity of centromere repeats in wild-type Arabidopsis thaliana. We show that A. thaliana CENH3-containing nucleosomes exhibit a strong preference for a unique subset of centromeric repeats. These sequences are largely missing from the genome assemblies and represent the youngest and most homogeneous class of repeats. Next, we tested the evolutionary specificity of this interaction in a background in which the native A. thaliana CENH3 is replaced with CENH3s from distant species. Strikingly, we find that CENH3 from Lepidium oleraceum and Zea mays, although specifying epigenetically weaker centromeres that result in genome elimination upon outcrossing, show a binding pattern on A. thaliana centromere repeats that is indistinguishable from the native CENH3. Our results demonstrate positional stability of a highly diverged CENH3 on independently evolved repeats, suggesting that the sequence specificity of centromeres is determined by a mechanism independent of CENH3. © 2017 Maheshwari et al.; Published by Cold Spring Harbor Laboratory Press.


July 7, 2019

Hybrid assembly of the large and highly repetitive genome of Aegilops tauschii, a progenitor of bread wheat, with the MaSuRCA mega-reads algorithm.

Long sequencing reads generated by single-molecule sequencing technology offer the possibility of dramatically improving the contiguity of genome assemblies. The biggest challenge today is that long reads have relatively high error rates, currently around 15%. The high error rates make it difficult to use this data alone, particularly with highly repetitive plant genomes. Errors in the raw data can lead to insertion or deletion errors (indels) in the consensus genome sequence, which in turn create significant problems for downstream analysis; for example, a single indel may shift the reading frame and incorrectly truncate a protein sequence. Here, we describe an algorithm that solves the high error rate problem by combining long, high-error reads with shorter but much more accurate Illumina sequencing reads, whose error rates average <1%. Our hybrid assembly algorithm combines these two types of reads to construct mega-reads, which are both long and accurate, and then assembles the mega-reads using the CABOG assembler, which was designed for long reads. We apply this technique to a large data set of Illumina and PacBio sequences from the species Aegilops tauschii, a large and extremely repetitive plant genome that has resisted previous attempts at assembly. We show that the resulting assembled contigs are far larger than in any previous assembly, with an N50 contig size of 486,807 nucleotides. We compare the contigs to independently produced optical maps to evaluate their large-scale accuracy, and to a set of high-quality bacterial artificial chromosome (BAC)-based assemblies to evaluate base-level accuracy. © 2017 Zimin et al.; Published by Cold Spring Harbor Laboratory Press.
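The frameshift problem mentioned in the abstract above can be illustrated with a minimal Python sketch. The toy sequence, the `translate` helper, and the deleted position are hypothetical choices made only for illustration; they are not taken from the paper or from the MaSuRCA software. The sketch translates a short coding sequence with the standard genetic code, then deletes a single base to show how one indel shifts the reading frame and introduces a premature stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy coding sequence: ATG GCT GAA TTG ACC AAA TAA
reference = "ATGGCTGAATTGACCAAATAA"

# Simulate a single-base deletion (an indel) at index 6.
with_deletion = reference[:6] + reference[7:]

print(translate(reference))      # MAELTK
print(translate(with_deletion))  # MAN (frame shifted, premature stop)
```

In this toy case the deletion turns the third and fourth codons into AAT and TGA, so translation stops after three residues instead of six.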


July 7, 2019

Genome sequence of Plasmopara viticola and insight into the pathogenic mechanism.

Plasmopara viticola causes downy mildew disease of grapevine, which is one of the most devastating diseases of viticulture worldwide. Here we report a 101.3 Mb whole genome sequence of P. viticola isolate ‘JL-7-2’ obtained by a combination of Illumina and PacBio sequencing technologies. The P. viticola genome contains 17,014 putative protein-coding genes and has ~26% repetitive sequences. A total of 1,301 putative secreted proteins, including 100 putative RXLR effectors and 90 CRN effectors, were identified in this genome. In the secretome, 261 potential pathogenicity genes and 95 carbohydrate-active enzymes were predicted. Transcriptional analysis revealed that most of the RXLR effectors, pathogenicity genes and carbohydrate-active enzymes were significantly up-regulated during infection. Comparative genomic analysis revealed that P. viticola evolved independently from the Arabidopsis downy mildew pathogen Hyaloperonospora arabidopsidis. The availability of the P. viticola genome not only provides a valuable resource for comparative genomic analysis and evolutionary studies among oomycetes, but also enhances our knowledge of the mechanism of interactions between this biotrophic pathogen and its host.


July 7, 2019

Terpene synthases from Cannabis sativa.

Cannabis (Cannabis sativa) plants produce and accumulate a terpene-rich resin in glandular trichomes, which are abundant on the surface of the female inflorescence. Bouquets of different monoterpenes and sesquiterpenes are important components of cannabis resin as they define some of the unique organoleptic properties and may also influence medicinal qualities of different cannabis strains and varieties. Transcriptome analysis of trichomes of the cannabis hemp variety ‘Finola’ revealed sequences of all stages of terpene biosynthesis. Nine cannabis terpene synthases (CsTPS) were identified in subfamilies TPS-a and TPS-b. Functional characterization identified mono- and sesqui-TPS, whose products collectively comprise most of the terpenes of ‘Finola’ resin, including major compounds such as β-myrcene, (E)-β-ocimene, (-)-limonene, (+)-α-pinene, β-caryophyllene, and α-humulene. Transcripts associated with terpene biosynthesis are highly expressed in trichomes compared to non-resin producing tissues. Knowledge of the CsTPS gene family may offer opportunities for selection and improvement of terpene profiles of interest in different cannabis strains and varieties.


July 7, 2019

Genetic and genomic tools for Cannabis sativa

The Cannabis industry is currently one of the fastest growing industries in the United States. Given the changing legal status of the plant, and the rapidly advancing research, updated information on the advancement of Cannabis genomics is needed. This versatile plant is used as medicine and for food, fiber, and bioremediation. Insights from modern, high-throughput genomic technology are revolutionizing our understanding of the plant and are providing new tools to further improve our knowledge and utilization of this unique species. This review quantifies and evaluates the currently available genomic resources for Cannabis research, including six whole-genome assemblies, two transcriptomes, and 393 other substantial genomic resources, as well as other smaller publicly available genetic and genomic resources. The open-source approaches followed by many leading scientists in the field promote collaboration and facilitate these rapid advances.


July 7, 2019

An improved assembly of the loblolly pine mega-genome using long-read single-molecule sequencing.

The 22-gigabase genome of loblolly pine (Pinus taeda) is one of the largest ever sequenced. The draft assembly published in 2014 was built entirely from short Illumina reads, with lengths ranging from 100 to 250 base pairs (bp). The assembly was quite fragmented, containing over 11 million contigs whose weighted average (N50) size was 8,206 bp. To improve this result, we generated approximately 12-fold coverage in long reads using the Single Molecule Real Time sequencing technology developed at Pacific Biosciences. We assembled the long and short reads together using the MaSuRCA mega-reads assembly algorithm, which produced a substantially better assembly, P. taeda version 2.0. The new assembly has an N50 contig size of 25,361 bp, more than three times as large as achieved in the original assembly, and an N50 scaffold size of 107,821 bp, 61% larger than the previous assembly. © The Author 2017. Published by Oxford University Press.
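For readers unfamiliar with the N50 statistic quoted above, the following is a minimal Python sketch of the commonly used definition: the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the total assembly length. The function name `n50` and the toy contig lengths are hypothetical and are not taken from the paper; the authors' exact computation may differ in detail.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            # At least half of all assembled bases sit in contigs
            # of this length or longer.
            return length


# Hypothetical toy assembly: total length 300, half is 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70, since 80 + 70 reaches 150
```

Because the statistic is weighted by length, a few long contigs raise N50 far more than many short ones, which is why it is a common summary of assembly contiguity.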


July 7, 2019

Extremely low genomic diversity of Rickettsia japonica distributed in Japan.

Rickettsiae are obligate intracellular bacteria that have small genomes as a result of reductive evolution. Many Rickettsia species of the spotted fever group (SFG) cause tick-borne diseases known as “spotted fevers”. The life cycle of SFG rickettsiae is closely associated with that of the tick, which is generally thought to act as a bacterial vector and reservoir that maintains the bacterium through transstadial and transovarial transmission. Each SFG member is thought to have adapted to a specific tick species, thus restricting the bacterial distribution to a relatively limited geographic region. These unique features of SFG rickettsiae allow investigation of how the genomes of such biologically and ecologically specialized bacteria evolve after genome reduction and the types of population structures that are generated. Here, we performed a nationwide, high-resolution phylogenetic analysis of Rickettsia japonica, an etiological agent of Japanese spotted fever that is distributed in Japan and Korea. The comparison of complete or nearly complete sequences obtained from 31 R. japonica strains isolated from various sources in Japan over the past 30 years demonstrated an extremely low level of genomic diversity. In particular, only 34 single nucleotide polymorphisms were identified among the 27 strains of the major lineage containing all clinical isolates and tick isolates from the three tick species. Our data provide novel insights into the biology and genome evolution of R. japonica, including the possibilities of recent clonal expansion and a long generation time in nature due to the long dormant phase associated with tick life cycles. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


July 7, 2019

The Nephila clavipes genome highlights the diversity of spider silk genes and their complex expression.

Spider silks are the toughest known biological materials, yet are lightweight and virtually invisible to the human immune system, and they thus have revolutionary potential for medicine and industry. Spider silks are largely composed of spidroins, a unique family of structural proteins. To investigate spidroin genes systematically, we constructed the first genome of an orb-weaving spider: the golden orb-weaver (Nephila clavipes), which builds large webs using an extensive repertoire of silks with diverse physical properties. We cataloged 28 Nephila spidroins, representing all known orb-weaver spidroin types, and identified 394 repeated coding motif variants and higher-order repetitive cassette structures unique to specific spidroins. Characterization of spidroin expression in distinct silk gland types indicates that glands can express multiple spidroin types. We find evidence of an alternatively spliced spidroin, a spidroin expressed only in venom glands, evolutionary mechanisms for spidroin diversification, and non-spidroin genes with expression patterns that suggest roles in silk production.

