April 21, 2020  |  

A robust benchmark for germline structural variant detection

New technologies and analysis methods are enabling genomic structural variants (SVs) to be detected with ever-increasing accuracy, resolution, and comprehensiveness. Translating these methods to routine research and clinical practice requires robust benchmark sets. We developed the first benchmark set for identification of both false negative and false positive germline SVs, which complements recent efforts emphasizing increasingly comprehensive characterization of SVs. To create this benchmark for a broadly consented son in a Personal Genome Project trio with broadly available cells and DNA, the Genome in a Bottle (GIAB) Consortium integrated 19 sequence-resolved variant calling methods, both alignment- and de novo assembly-based, from short-, linked-, and long-read sequencing, as well as optical and electronic mapping. The final benchmark set contains 12745 isolated, sequence-resolved insertion and deletion calls ≥50 base pairs (bp) discovered by at least 2 technologies or 5 callsets, genotyped as heterozygous or homozygous variants by long reads. The Tier 1 benchmark regions, for which any extra calls are putative false positives, cover 2.66 Gbp and 9641 SVs supported by at least one diploid assembly. Support for SVs was assessed using svviz with short-, linked-, and long-read sequence data. In general, there was strong support from multiple technologies for the benchmark SVs, with 90% of the Tier 1 SVs having support in reads from more than one technology. The Mendelian genotype error rate was 0.3%, and genotype concordance with manual curation was >98.7%. We demonstrate the utility of the benchmark set by showing it reliably identifies both false negatives and false positives in high-quality SV callsets from short-, linked-, and long-read sequencing and optical mapping.


April 21, 2020  |  

Transcriptional initiation of a small RNA, not R-loop stability, dictates the frequency of pilin antigenic variation in Neisseria gonorrhoeae.

Neisseria gonorrhoeae, the sole causative agent of gonorrhea, constitutively undergoes diversification of the Type IV pilus. Gene conversion occurs between one of the several donor silent copies located in distinct loci and the recipient pilE gene, encoding the major pilin subunit of the pilus. A guanine quadruplex (G4) DNA structure and a cis-acting sRNA (G4-sRNA) are located upstream of the pilE gene and both are required for pilin antigenic variation (Av). We show that reduced sRNA transcription lowers pilin Av frequencies. Extended transcriptional elongation is not required for Av, since limiting the transcript to 32 nt allows for normal Av frequencies. Using chromatin immunoprecipitation (ChIP) assays, we show that cellular G4s are less abundant when sRNA transcription is lower. In addition, using ChIP, we demonstrate that the G4-sRNA forms a stable RNA:DNA hybrid (R-loop) with its template strand. However, modulating R-loop levels by controlling RNase HI expression does not alter G4 abundance quantified through ChIP. Since pilin Av frequencies were not altered when modulating R-loop levels by controlling RNase HI expression, we conclude that transcription of the sRNA is necessary, but stable R-loops are not required to promote pilin Av. © 2019 John Wiley & Sons Ltd.


April 21, 2020  |  

How Genomics Is Changing What We Know About the Evolution and Genome of Bordetella pertussis.

The evolution of Bordetella pertussis from a common ancestor similar to Bordetella bronchiseptica has occurred through large-scale gene loss, inactivation and rearrangements, largely driven by the spread of insertion sequence element repeats throughout the genome. B. pertussis is widely considered to be monomorphic, and recent evolution of the B. pertussis genome appears to be driven, at least in part, by vaccine-based selection. Given the recent global resurgence of whooping cough despite the widespread use of vaccination, a more thorough understanding of B. pertussis genomics could be highly informative. In this chapter we discuss the evolution of B. pertussis, including how vaccination is changing the circulating B. pertussis population at the gene level, and how new sequencing technologies are revealing previously unknown levels of inter- and intra-strain variation at the genome level.


April 21, 2020  |  

Comparison of mitochondrial DNA variants detection using short- and long-read sequencing.

The recent advent of long-read sequencing technologies is expected to provide reasonable answers to genetic challenges unresolvable by short-read sequencing, primarily the inability to accurately study structural variations, copy number variations, and homologous repeats in complex parts of the genome. However, long-read sequencing comes with higher rates of random short deletions and insertions, and single nucleotide errors. The relatively higher accuracy of short-read sequencing has kept it the first choice for screening single nucleotide variants and short deletions and insertions. Nevertheless, short-read sequencing still suffers from systematic errors that tend to occur at specific positions, where a high read depth cannot always correct them. In this study, we compared the genotyping of mitochondrial DNA variants in three samples using PacBio’s Sequel (Pacific Biosciences Inc., Menlo Park, CA, USA) long-read sequencing and Illumina’s HiSeqX10 (Illumina Inc., San Diego, CA, USA) short-read sequencing data. We concluded that, despite the differences in the type and frequency of errors in long-read sequencing, its accuracy is still comparable to that of short reads for genotyping short nuclear variants; due to the randomness of errors in long reads, a lower coverage, around 37 reads, can be sufficient to correct for these random errors.


April 21, 2020  |  

Benchmarking Transposable Element Annotation Methods for Creation of a Streamlined, Comprehensive Pipeline

Sequencing technology and assembly algorithms have matured to the point that high-quality de novo assembly is possible for large, repetitive genomes. Current assemblies traverse transposable elements (TEs) and allow for annotation of TEs. There are numerous methods for each class of elements with unknown relative performance metrics. We benchmarked existing programs based on a curated library of rice TEs. Using the most robust programs, we created a comprehensive pipeline called Extensive de-novo TE Annotator (EDTA) that produces a condensed TE library for annotations of structurally intact and fragmented elements. EDTA is open-source and freely available: https://github.com/oushujun/EDTA.

List of abbreviations: TE, Transposable Elements; LTR, Long Terminal Repeat; LINE, Long Interspersed Nuclear Element; SINE, Short Interspersed Nuclear Element; MITE, Miniature Inverted Transposable Element; TIR, Terminal Inverted Repeat; TSD, Target Site Duplication; TP, True Positives; FP, False Positives; TN, True Negative; FN, False Negatives; GRF, Generic Repeat Finder; EDTA, Extensive de-novo TE Annotator.


April 21, 2020  |  

Membrane proteomic analysis reveals overlapping and independent functions of Streptococcus mutans Ffh, YidC1, and YidC2.

A comparative proteomic analysis was utilized to evaluate similarities and differences in membrane samples derived from the cariogenic bacterium Streptococcus mutans, including the wild-type strain and four mutants devoid of protein translocation machinery components, specifically Δffh, ΔyidC1, ΔyidC2, or Δffh/yidC1. The purpose of this work was to determine the extent to which the encoded proteins operate individually or in concert with one another and to identify the potential substrates of the respective pathways. Ffh is the principal protein component of the signal recognition particle (SRP), while yidC1 and yidC2 are dual paralogs encoding members of the YidC/Oxa/Alb family of membrane-localized chaperone insertases. Our results suggest that the co-translational SRP pathway works in concert with either YidC1 or YidC2 specifically, or with no preference for paralog, in the insertion of most membrane-localized substrates. A few instances were identified in which the SRP pathway alone, or one of the YidCs alone, appeared to be most relevant. These data shed light on underlying reasons for differing phenotypic consequences of ffh, yidC1 or yidC2 deletion. Our data further suggest that many membrane proteins present in a ΔyidC2 background may be non-functional, that ΔyidC1 is better able to adapt physiologically to the loss of this paralog, that shared phenotypic properties of Δffh and ΔyidC2 mutants can stem from impacts on different proteins, and that independent binding to ribosomal proteins is not a primary functional activity of YidC2. Lastly, genomic mutations accumulate in a ΔyidC2 background coincident with phenotypic reversion, including an apparent W138R suppressor mutation within yidC1. © 2019 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.


April 21, 2020  |  

Emergence of plasmid-mediated high-level tigecycline resistance genes in animals and humans.

Tigecycline is a last-resort antibiotic that is used to treat severe infections caused by extensively drug-resistant bacteria. tet(X) has been shown to encode a flavin-dependent monooxygenase that modifies tigecycline [1,2]. Here, we report two unique mobile tigecycline-resistance genes, tet(X3) and tet(X4), in numerous Enterobacteriaceae and Acinetobacter that were isolated from animals, meat for consumption and humans. Tet(X3) and Tet(X4) inactivate all tetracyclines, including tigecycline and the newly FDA-approved eravacycline and omadacycline. Both tet(X3) and tet(X4) increase (by 64- to 128-fold) the tigecycline minimal inhibitory concentration values for Escherichia coli, Klebsiella pneumoniae and Acinetobacter baumannii. In addition, both Tet(X3) (A. baumannii) and Tet(X4) (E. coli) significantly compromise tigecycline in in vivo infection models. Both tet(X3) and tet(X4) are adjacent to insertion sequence ISVsa3 on their respective conjugative plasmids and confer a mild fitness cost (relative fitness of >0.704). Database mining and retrospective screening analyses confirm that tet(X3) and tet(X4) are globally present in clinical bacteria, even in the same bacteria as blaNDM-1, resulting in resistance to both tigecycline and carbapenems. Our findings suggest that both the surveillance of tet(X) variants in clinical and animal sectors and the use of tetracyclines in food production require urgent global attention.


April 21, 2020  |  

Complete genome sequence and characterization of virulence genes in Lancefield group C Streptococcus dysgalactiae isolated from farmed amberjack (Seriola dumerili).

Lancefield group C Streptococcus dysgalactiae causes infections in farmed fish. Here, the genome of S. dysgalactiae strain kdys0611, isolated from farmed amberjack (Seriola dumerili), was sequenced. The complete genome sequence of kdys0611 consists of a single chromosome and five plasmids. The chromosome is 2,142,780 bp long and has a GC content of 40%. It possesses 2061 coding sequences and 67 tRNA and 6 rRNA operons. One clustered regularly interspaced short palindromic repeat, 125 insertion sequences, and four predicted prophage elements were identified. Phylogenetic analysis based on 126 core genes suggested that the kdys0611 strain is more closely related to S. dysgalactiae subsp. dysgalactiae than to S. dysgalactiae subsp. equisimilis. The genome of kdys0611 harbors 87 genes with sequence similarity to putative virulence-associated genes identified in other bacteria, of which 57 exhibit amino acid identity (>52%) to genes of the S. dysgalactiae subsp. equisimilis GGS124 human clinical isolate. Four putative virulence genes, emm5 (FGCSD_0256), spg_2 (FGCSD_1961), skc (FGCSD_1012), and cna (FGCSD_0159), in kdys0611 did not show significant homology with any deposited S. dysgalactiae genes. The chromosomal sequence of kdys0611 has been deposited in GenBank under Accession No. AP018726. This is the first report of the complete genome sequence of S. dysgalactiae isolated from fish. © 2019 The Societies and John Wiley & Sons Australia, Ltd.


April 21, 2020  |  

Detection of transferable oxazolidinone resistance determinants in Enterococcus faecalis and Enterococcus faecium of swine origin in Sichuan Province, China.

The aim of this study was to detect the transferable oxazolidinone resistance determinants (cfr, optrA and poxtA) in E. faecalis and E. faecium of swine origin in Sichuan Province, China. A total of 158 enterococci strains (93 E. faecalis and 65 E. faecium) isolated from 25 large-scale swine farms were screened for the presence of cfr, optrA and poxtA by PCR. The genetic environments of cfr, optrA and poxtA were characterized by whole genome sequencing. Transfer of oxazolidinone resistance determinants was determined by conjugation or electrotransformation experiments. The transferable oxazolidinone resistance determinants, cfr, optrA and poxtA, were detected in zero, six, and one enterococci strains, respectively. The poxtA in one E. faecalis strain was located on a 37,990 bp plasmid, which co-harbored fexB, cat, tet(L) and tet(M), and could be conjugated to E. faecalis JH2-2. One E. faecalis strain harbored two different OptrA variants, including one variant with a single substitution, Q219H, which has not been reported previously. Two optrA-carrying plasmids, pC25-1, with a size of 45,581 bp, and pC54, with a size of 64,500 bp, shared a 40,494 bp identical region that contained genetic context IS1216E-fexA-optrA-erm(A)-IS1216E, which could be electrotransformed into Staphylococcus aureus. Four different chromosomal optrA gene clusters were found in five strains, in which optrA was associated with Tn554 or Tn558 that were inserted into the radC gene. Our study highlights the fact that mobile genetic elements, such as plasmids, IS1216E, Tn554 and Tn558, may facilitate the horizontal transmission of optrA or poxtA. Copyright © 2019. Published by Elsevier Ltd.


April 21, 2020  |  

Insect genomes: progress and challenges.

In the wake of constant improvements in sequencing technologies, numerous insect genomes have been sequenced. Currently, 1219 insect genome-sequencing projects have been registered with the National Center for Biotechnology Information, including 401 that have genome assemblies and 155 with an official gene set of annotated protein-coding genes. Comparative genomics analysis showed that the expansion or contraction of gene families was associated with well-studied physiological traits such as immune system, metabolic detoxification, parasitism and polyphagy in insects. Here, we summarize the progress of insect genome sequencing, with an emphasis on how this impacts research on pest control. We begin with a brief introduction to the basic concepts of genome assembly, annotation and metrics for evaluating the quality of draft assemblies. We then provide an overview of genome information for numerous insect species, highlighting examples from prominent model organisms, agricultural pests and disease vectors. We also introduce the major insect genome databases. The increasing availability of insect genomic resources is beneficial for developing alternative pest control methods. However, many opportunities remain for developing data-mining tools that make maximal use of the available insect genome resources. Although rapid progress has been achieved, many challenges remain in the field of insect genomics. © 2019 The Royal Entomological Society.


April 21, 2020  |  

Full-length mRNA sequencing and gene expression profiling reveal broad involvement of natural antisense transcript gene pairs in pepper development and response to stresses.

Pepper is an important vegetable with great economic value and unique biological features. In the past few years, significant development has been made towards understanding the huge complex pepper genome; however, pepper functional genomics has not been well studied. To better understand the pepper gene structure and pepper gene regulation, we conducted full-length mRNA sequencing by PacBio sequencing and obtained 57862 high-quality full-length mRNA sequences derived from 18362 previously annotated and 5769 newly detected genes. New gene models were built that combined the full-length mRNA sequences and corrected approximately 500 fragmented gene models from previous annotations. Based on the full-length mRNA, we identified 4114 and 5880 pepper genes forming natural antisense transcript (NAT) genes in-cis and in-trans, respectively. Most of these genes accumulate small RNAs in their overlapping regions. By analyzing these NAT gene expression patterns in our transcriptome data, we identified many NAT pairs responsive to a variety of biological processes in pepper. Pepper formate dehydrogenase 1 (FDH1), which is required for R-gene-mediated disease resistance, may be regulated by nat-siRNAs and participate in a positive feedback loop in salicylic acid biosynthesis during resistance responses. Several cis-NAT pairs and subgroups of trans-NAT genes were responsive to pepper pericarp and placenta development, which may play roles in capsanthin and capsaicin biosynthesis. Using a comparative genomics approach, the evolutionary mechanisms of cis-NATs were investigated, and we found that an increase in intergenic sequences accounted for the loss of most cis-NATs, while transposon insertion contributed to the formation of most new cis-NATs. This article is protected by copyright. All rights reserved.


April 21, 2020  |  

Defining transgene insertion sites and off-target effects of homology-based gene silencing informs the use of functional genomics tools in Phytophthora infestans.

DNA transformation and homology-based transcriptional silencing are frequently used to assess gene function in Phytophthora. Since unplanned side-effects of these tools are not well-characterized, we used P. infestans to study plasmid integration sites and whether knockdowns caused by homology-dependent silencing spread to other genes. Insertions occurred both in gene-dense and gene-sparse regions but disproportionately near the 5′ ends of genes, which disrupted native coding sequences. Microhomology at the recombination site between plasmid and chromosome was common. Studies of transformants silenced for twelve different gene targets indicated that neighbors within 500 nt were often co-silenced, regardless of whether hairpin or sense constructs were employed and regardless of the direction of transcription of the target. However, cis-spreading of silencing did not occur in all transformants obtained with the same plasmid. Genome-wide studies indicated that unlinked genes with partial complementarity with the silencing-inducing transgene were not usually down-regulated. We learned that hairpin or sense transgenes were not co-silenced with the target in all transformants, which informs how screens for silencing should be performed. We conclude that transformation and gene silencing can be reliable tools for functional genomics in Phytophthora but must be used carefully, especially by testing for the spread of silencing to genes flanking the target.


April 21, 2020  |  

Genome rearrangements induce biofilm formation in Escherichia coli C, an old model organism with a new application in biofilm research

Escherichia coli C forms more robust biofilms than the other laboratory strains. Biofilm formation and cell aggregation under a high shear force depends on temperature and salt concentrations. It is the last of five E. coli strains (C, K12, B, W, Crooks) designated as safe for laboratory purposes whose genome has not been sequenced. Here we present the complete genomic sequence of this strain in which we utilized both long-read PacBio-based sequencing and high resolution optical mapping to confirm a large inversion in comparison to the other laboratory strains. Notably, DNA sequence comparison revealed the absence of several genes thought to be involved in biofilm formation, including antigen 43, waaSBOJYZUL for LPS synthesis, and cpsB for curli synthesis. The first main difference we identified that likely affects biofilm formation is the presence of an IS3-like insertion sequence in front of the carbon storage regulator csrA gene. This insertion is located 86 bp upstream of the csrA start codon inside the -35 region of P4 promoter and blocks the transcription from the sigma32 and sigma70 promoters P1-P3 located further upstream. The second is the presence of an IS5/IS1182 in front of the csgD gene, which may drive its overexpression in biofilm. And finally, E. coli C encodes an additional sigma70 subunit overexpressed in biofilm and driven by the same IS3-like insertion sequence. Promoter analyses using GFP gene fusions and total expression profiles using RNA-seq analyses comparing planktonic and biofilm envirovars provided insights into understanding this regulatory pathway in E. coli.


April 21, 2020  |  

Insertion sequences drive the emergence of a highly adapted human pathogen.

Pseudomonas aeruginosa is a highly adaptive opportunistic pathogen that can have serious health consequences in patients with lung disorders. Taxonomic outliers of P. aeruginosa of environmental origin have recently emerged as infectious for humans. Here, we present the first genome-wide analysis of an isolate that caused fatal haemorrhagic pneumonia. In two clones, CLJ1 and CLJ3, sequentially recovered from a patient with chronic pulmonary disease, insertion of a mobile genetic element into the P. aeruginosa chromosome affected major virulence-associated phenotypes and led to increased resistance to the antibiotics used to combat the infection. Comparative genome, proteome and transcriptome analyses revealed that this ISL3-family insertion sequence disrupted the genes for flagellar components, type IV pili, O-specific antigens, translesion polymerase and enzymes producing hydrogen cyanide. Seven-fold more insertions were detected in the later isolate, CLJ3, than in CLJ1, some of which modified strain susceptibility to antibiotics by disrupting the genes for the outer-membrane porin OprD and the regulator of β-lactamase expression AmpD. In the Galleria mellonella larvae model, the two strains displayed different levels of virulence, with CLJ1 being highly pathogenic. This study revealed insertion sequences to be major players in enhancing the pathogenic potential of a P. aeruginosa taxonomic outlier by modulating both its virulence and its resistance to antimicrobials, and explains how this bacterium adapts from the environment to a human host.


April 21, 2020  |  

Genome-wide selection footprints and deleterious variations in young Asian allotetraploid rapeseed.

Brassica napus (AACC, 2n = 38) is an important oilseed crop grown worldwide. However, little is known about the population evolution of this species, the genomic difference between its major genetic groups, such as European and Asian rapeseed, and the impacts of historical large-scale introgression events on this young tetraploid. In this study, we reported the de novo assembly of the genome sequences of an Asian rapeseed (B. napus), Ningyou 7, and its four progenitors and compared these genomes with other available genomic data from diverse European and Asian cultivars. Our results showed that Asian rapeseed originally derived from European rapeseed but subsequently significantly diverged, with rapid genome differentiation after hybridization and intensive local selective breeding. The first historical introgression of B. rapa dramatically broadened the allelic pool but decreased the deleterious variations of Asian rapeseed. The second historical introgression of the double-low traits of European rapeseed (canola) has reshaped Asian rapeseed into two groups (double-low and double-high), accompanied by an increase in genetic load in the double-low group. This study demonstrates distinctive genomic footprints and deleterious SNP (single nucleotide polymorphism) variants for local adaptation by recent intra- and interspecies introgression events and provides novel insights for understanding the rapid genome evolution of a young allopolyploid crop. © 2019 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

