September 22, 2019

Unexpected patterns of segregation distortion at a selfish supergene in the fire ant Solenopsis invicta.

The Sb supergene in the fire ant Solenopsis invicta determines the form of colony social organization, with colonies whose inhabitants bear the element containing multiple reproductive queens and colonies lacking it containing only a single queen. Several features of this supergene – including suppressed recombination, presence of deleterious mutations, association with a large centromere, and “green-beard” behavior – suggest that it may be a selfish genetic element that engages in transmission ratio distortion (TRD), defined as significant departures in progeny allele frequencies from Mendelian inheritance ratios. We tested this possibility by surveying segregation ratios in embryo progenies of 101 queens of the “polygyne” social form (3512 embryos) using three supergene-linked markers and twelve markers outside the supergene. Significant departures from Mendelian ratios were observed at the supergene loci in 3-5 times more progenies than expected in the absence of TRD and than found, on average, among non-supergene loci. Also, supergene loci displayed the greatest mean deviations from Mendelian ratios among all study loci, although these typically were modest. A surprising feature of the observed inter-progeny variation in TRD was that significant deviations involved not only excesses of supergene alleles but also similarly frequent excesses of the alternate alleles on the homologous chromosome. As expected given the common occurrence of such “drive reversal” in this system, alleles associated with the supergene gain no consistent transmission advantage over their alternate alleles at the population level.
Finally, we observed low levels of recombination and incomplete gametic disequilibrium across the supergene, including between adjacent markers within a single inversion. Our data confirm the prediction that the Sb supergene is a selfish genetic element capable of biasing its own transmission during reproduction, yet counterselection for suppressor loci evidently has produced an evolutionary stalemate in TRD between the variant homologous haplotypes on the “social chromosome”. Evidence implicates prezygotic segregation distortion as responsible for the TRD we document, with “true” meiotic drive the most likely mechanism. Low levels of recombination and incomplete gametic disequilibrium across the supergene suggest that selection does not preserve a single uniform supergene haplotype responsible for inducing polygyny.
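
As an illustration of what testing for a departure from Mendelian ratios can look like for a single progeny, here is a minimal sketch of an exact two-sided binomial test against a 1:1 expectation. The abstract does not name the test the authors used, so the choice of test, the function name `binomial_two_sided_p`, and the embryo counts are all assumptions made for this example.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of every outcome that is no more likely than
    the observed count k out of n, under success probability p.
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance so outcomes tied with the observed one are included.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical progeny: 25 of 35 genotyped embryos carry the supergene-linked allele.
p_value = binomial_two_sided_p(25, 35)
print(f"two-sided p = {p_value:.4f}")
```

Because many progenies and loci are tested, some nominally significant results are expected by chance alone, which is why the abstract compares the observed number of significant progenies with the number expected in the absence of TRD.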


September 22, 2019

Improved reference genome for the domestic horse increases assembly contiguity and composition.

Recent advances in genomic sequencing technology and computational assembly methods have allowed scientists to improve reference genome assemblies in terms of contiguity and composition. EquCab2, a reference genome for the domestic horse, was released in 2007. Although of equal or better quality compared to other first-generation Sanger assemblies, it had many of the shortcomings common to them. In 2014, the equine genomics research community began a project to improve the reference sequence for the horse, building upon the solid foundation of EquCab2 and incorporating new short-read data, long-read data, and proximity ligation data. Here, we present EquCab3. The count of non-N bases in the incorporated chromosomes is improved from 2.33 Gb in EquCab2 to 2.41 Gb in EquCab3. Contiguity has also been improved nearly 40-fold, with a contig N50 of 4.5 Mb and scaffold contiguity enhanced to the point where all but one of the 32 chromosomes consists of a single scaffold.
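
The contig N50 quoted above is the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the summed contig length. A minimal sketch of that calculation follows; the function name `n50` and the toy lengths are illustrative assumptions and are not EquCab3 data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running total to half the
        # assembly size defines the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical contig lengths (arbitrary units), total 290.
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))  # 80 + 70 = 150 >= 145, so this prints 70
```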


September 22, 2019

The enterococcus cassette chromosome, a genomic variation enabler in enterococci.

Enterococcus faecium has a highly variable genome prone to recombination and horizontal gene transfer. Here, we have identified a novel genetic island with an insertion locus and mobilization genes similar to those of staphylococcus cassette chromosome elements SCCmec. This novel element, termed the enterococcus cassette chromosome (ECC) element, was located in the 3′ region of rlmH and encoded large serine recombinases ccrAB similar to SCCmec. Horizontal transfer of an ECC element termed ECC::cat, containing a knock-in cat chloramphenicol resistance determinant, occurred in the presence of a conjugative reppLG1 plasmid. We determined the ECC::cat insertion site in the 3′ region of rlmH in the E. faecium recipient by long-read sequencing. ECC::cat also mobilized by homologous recombination through sequence identity between flanking insertion sequence (IS) elements in ECC::cat and the conjugative plasmid. The ccrABEnt genes were found in 69 of 516 E. faecium genomes in GenBank. Full-length ECC elements were retrieved from 32 of these genomes. ECCs were flanked by attR and attL sites of approximately 50 bp. The attECC sequences were found by PCR and sequencing of circularized ECCs in three strains. The genes in ECCs contained an amalgam of common and rare E. faecium genes. Taken together, our data imply that ECC elements act as hot spots for genetic exchange and contribute to the large variation of accessory genes found in E. faecium.
IMPORTANCE: Enterococcus faecium is a bacterium found in a great variety of environments, ranging from the clinic as a nosocomial pathogen to natural habitats such as mammalian intestines, water, and soil. These bacteria are known to exchange genetic material through horizontal gene transfer and recombination, leading to great variability of accessory genes and aiding environmental adaptation. Identifying mobile genetic elements causing sequence variation is important to understand how genetic content variation occurs. Here, a novel genetic island, the enterococcus cassette chromosome, is shown to contain a wealth of genes, which may aid E. faecium in adapting to new environments. The transmission mechanism involves the only two conserved genes within ECC, ccrABEnt, large serine recombinases that insert ECC into the host genome similarly to SCC elements found in staphylococci. Copyright © 2018 Sivertsen et al.


September 22, 2019

Noise-Cancelling Repeat Finder: Uncovering tandem repeats in error-prone long-read sequencing data

Tandem DNA repeats can be sequenced with long-read technologies, but cannot be accurately deciphered due to the lack of computational tools taking high error rates of these technologies into account. Here we introduce Noise-Cancelling Repeat Finder (NCRF) to uncover putative tandem repeats of specified motifs in noisy long reads produced by Pacific Biosciences and Oxford Nanopore sequencers. Using simulations, we validated the use of NCRF to locate tandem repeats with motifs of various lengths and demonstrated its superior performance as compared to two alternative tools. Using real human whole-genome sequencing data, NCRF identified long arrays of the (AATGG)n repeat involved in heat shock stress response.


September 22, 2019

Analysis of structural variants in four African cichlids highlights an association with developmental and immune related genes

African Lake cichlids are one of the most impressive examples of adaptive radiation. Independently in Lakes Victoria, Tanganyika, and Malawi, several hundred species arose within the last 10 million to 100,000 years. Whereas most analyses in cichlids have focused on nucleotide substitutions across species to investigate the genetic bases of this explosive radiation, to date, no study has investigated the contribution of structural variants (SVs) to speciation events (through a reduction of gene flow) and adaptation to different ecological niches. Here, we annotate and characterize the repertoires and evolutionary potential of different SV classes (deletions, duplications, inversions, insertions and translocations) in five cichlid species (Astatotilapia burtoni, Metriaclima zebra, Neolamprologus brichardi, Pundamilia nyererei and Oreochromis niloticus). We investigate the patterns of gain/loss evolution across the phylogeny for each SV type, enabling the identification of both lineage-specific events and a set of conserved SVs common to all four species in the radiation. Both deletion and inversion events show a significant overlap with SINE elements, while inversions additionally show a limited but significant association with DNA transposons. Genes lying inside inverted regions are enriched for genes regulating behaviour or involved in skeletal and visual system development. Moreover, we find that duplicated genes show enrichment for ‘antigen processing and presentation’ (GO:0019882) and other immune-related categories. Altogether, we provide the first comprehensive overview of rearrangement evolution in East African cichlids, and some initial insights into their possible contribution to adaptation.


September 22, 2019

Extensive and deep sequencing of the Venter/HuRef genome for developing and benchmarking genome analysis tools.

We produced an extensive collection of deep re-sequencing datasets for the Venter/HuRef genome using the Illumina massively-parallel DNA sequencing platform. The original Venter genome sequence is a very-high quality phased assembly based on Sanger sequencing. Therefore, researchers developing novel computational tools for the analysis of human genome sequence variation for the dominant Illumina sequencing technology can test and hone their algorithms by making variant calls from these Venter/HuRef datasets and then immediately confirm the detected variants in the Sanger assembly, freeing them of the need for further experimental validation. This process also applies to implementing and benchmarking existing genome analysis pipelines. We prepared and sequenced 200 bp and 350 bp short-insert whole-genome sequencing libraries (sequenced to 100x and 40x genomic coverages respectively) as well as 2 kb, 5 kb, and 12 kb mate-pair libraries (49x, 122x, and 145x physical coverages respectively). Lastly, we produced a linked-read library (128x physical coverage) from which we also performed haplotype phasing.


September 22, 2019

Density-dependent enhanced replication of a densovirus in Wolbachia-infected Aedes cells is associated with production of piRNAs and higher virus-derived siRNAs.

The endosymbiotic bacterium Wolbachia pipientis has been shown to restrict a range of RNA viruses in Drosophila melanogaster and transinfected dengue mosquito, Aedes aegypti. Here, we show that Wolbachia infection enhances replication of Aedes albopictus densovirus (AalDNV-1), a single stranded DNA virus, in Aedes cell lines in a density-dependent manner. Analysis of previously produced small RNAs of Aag2 cells showed that Wolbachia-infected cells produced greater absolute abundance of virus-derived short interfering RNAs compared to uninfected cells. Additionally, we found production of virus-derived PIWI-like RNAs (vpiRNA) produced in response to AalDNV-1 infection. Nuclear fractions of Aag2 cells produced a primary vpiRNA signature U1 bias whereas the typical “ping-pong” signature (U1 – A10) was evident in vpiRNAs from the cytoplasmic fractions. This is the first report of the density-dependent enhancement of DNA viruses by Wolbachia. Further, we report the generation of vpiRNAs in a DNA virus-host interaction for the first time. Copyright © 2018 Elsevier Inc. All rights reserved.


September 22, 2019

Hypervirulent group A Streptococcus emergence in an acapsular background is associated with marked remodeling of the bacterial cell surface

Inactivating mutations in the control of virulence two-component regulatory system (covRS) often account for the hypervirulent phenotype in severe, invasive group A streptococcal (GAS) infections. As CovR represses production of the anti-phagocytic hyaluronic acid capsule, high-level capsule production is generally considered critical to the hypervirulent phenotype induced by CovRS inactivation. There have recently been large outbreaks of GAS strains lacking capsule, but there are currently no data on the virulence of covRS-mutated, acapsular strains in vivo. We investigated the impact of CovRS inactivation in acapsular serotype M4 strains using a wild-type strain (M4-SC-1) and a naturally occurring CovS-inactivated strain (M4-LC-1) that contains an 11 bp covS insertion. M4-LC-1 was significantly more virulent in a mouse bacteremia model but caused smaller lesions in a subcutaneous mouse model. Over 10% of the genome showed significantly different transcript levels in M4-LC-1 compared with M4-SC-1. Notably, the Mga regulon and multiple cell surface protein-encoding genes were strongly upregulated, a finding not observed for CovS-inactivated, encapsulated M1 or M3 GAS strains. Consistent with the transcriptomic data, transmission electron microscopy revealed markedly altered cell surface morphology of M4-LC-1 compared to M4-SC-1. Insertional inactivation of covS in M4-SC-1 recapitulated the transcriptome and cell surface morphology. Analysis of the cell surface following CovS inactivation revealed that the upregulated proteins were part of the Mga regulon. Inactivation of mga in M4-LC-1 reduced transcript levels of multiple cell surface proteins and reversed the cell surface alterations, consistent with the effect of CovS inactivation on cell surface composition being mediated by Mga. CovRS-inactivating mutations were detected in 20% of current invasive serotype M4 strains in the United States.
Thus, we discovered that hypervirulent M4 GAS strains with covRS mutations can arise in an acapsular background and that such hypervirulence is associated with profound alteration of the cell surface.


September 22, 2019

Detection and visualization of complex structural variants from long reads.

With applications in cancer, drug metabolism, and disease etiology, understanding structural variation in the human genome is critical in advancing the thrusts of individualized medicine. However, structural variants (SVs) remain challenging to detect with high sensitivity using short read sequencing technologies. This problem is exacerbated when considering complex SVs comprised of multiple overlapping or nested rearrangements. Longer reads, such as those from Pacific Biosciences platforms, often span multiple breakpoints of such events, and thus provide a way to unravel small-scale complexities in SVs with higher confidence. We present CORGi (COmplex Rearrangement detection with Graph-search), a method for the detection and visualization of complex local genomic rearrangements. This method leverages the ability of long reads to span multiple breakpoints to untangle SVs that appear very complicated with respect to a reference genome. We validated our approach against both simulated long reads, and real data from two long read sequencing technologies. We demonstrate the ability of our method to identify breakpoints inserted in synthetic data with high accuracy, and the ability to detect and plot SVs from NA12878 germline, achieving 88.4% concordance between the two sets of sequence data. The patterns of complexity we find in many NA12878 SVs match known mechanisms associated with DNA replication and structural variant formation, and highlight the ability of our method to automatically label complex SVs with an intuitive combination of adjacent or overlapping reference transformations. CORGi is a method for interrogating genomic regions suspected to contain local rearrangements using long reads. Using pairwise alignments and graph search, CORGi produces labels and visualizations for local SVs of arbitrary complexity.


September 22, 2019

Phototaxis in a wild isolate of the cyanobacterium Synechococcus elongatus.

Many cyanobacteria, which use light as an energy source via photosynthesis, have evolved the ability to guide their movement toward or away from a light source. This process, termed “phototaxis,” enables organisms to localize in optimal light environments for improved growth and fitness. Mechanisms of phototaxis have been studied in the coccoid cyanobacterium Synechocystis sp. strain PCC 6803, but the rod-shaped Synechococcus elongatus PCC 7942, studied for circadian rhythms and metabolic engineering, has no phototactic motility. In this study we report a recent environmental isolate of S. elongatus, the strain UTEX 3055, whose genome is 98.5% identical to that of PCC 7942 but which is motile and phototactic. A six-gene operon encoding chemotaxis-like proteins was confirmed to be involved in phototaxis. Environmental light signals are perceived by a cyanobacteriochrome, PixJSe (Synpcc7942_0858), which carries five GAF domains that are responsive to blue/green light and resemble those of PixJ from Synechocystis. Plate-based phototaxis assays indicate that UTEX 3055 uses PixJSe to sense blue and green light. Mutation of conserved functional cysteine residues in different GAF domains indicates that PixJSe controls both positive and negative phototaxis, in contrast to the multiple proteins that are employed for implementing bidirectional phototaxis in Synechocystis.


September 22, 2019

Integrative haplotype estimation with sub-linear complexity

The number of human genomes being genotyped or sequenced is increasing exponentially, and efficient haplotype estimation methods able to handle this amount of data are now required. Here, we present a new method, SHAPEIT4, which substantially improves upon other methods to process large genotype and high coverage sequencing datasets. It notably exhibits sub-linear scaling with sample size, provides highly accurate haplotypes and allows integrating external phasing information such as large reference panels of haplotypes, collections of pre-phased variants and long sequencing reads. We provide SHAPEIT4 in an open source format on https://odelaneau.github.io/shapeit4/ and demonstrate its performance in terms of accuracy and running times on two gold standard datasets: the UK Biobank data and the Genome In A Bottle.


September 21, 2019

Mistranslation drives the evolution of robustness in TEM-1 β-lactamase.

How biological systems such as proteins achieve robustness to ubiquitous perturbations is a fundamental biological question. Such perturbations include errors that introduce phenotypic mutations into nascent proteins during the translation of mRNA. These errors are remarkably frequent. They are also costly, because they reduce protein stability and help create toxic misfolded proteins. Adaptive evolution might reduce these costs of protein mistranslation by two principal mechanisms. The first increases the accuracy of translation via synonymous “high fidelity” codons at especially sensitive sites. The second increases the robustness of proteins to phenotypic errors via amino acids that increase protein stability. To study how these mechanisms are exploited by populations evolving in the laboratory, we evolved the antibiotic resistance gene TEM-1 in Escherichia coli hosts with either normal or high rates of mistranslation. We analyzed TEM-1 populations that evolved under relaxed and stringent selection for antibiotic resistance by single molecule real-time sequencing. Under relaxed selection, mistranslating populations reduce mistranslation costs by reducing TEM-1 expression. Under stringent selection, they efficiently purge destabilizing amino acid changes. More importantly, they accumulate stabilizing amino acid changes rather than synonymous changes that increase translational accuracy. In the large populations we study, and on short evolutionary timescales, the path of least resistance in TEM-1 evolution consists of reducing the consequences of translation errors rather than the errors themselves.


September 21, 2019

Whole genome sequence of the soybean aphid, Aphis glycines.

Aphids are emerging as model organisms for both basic and applied research. Of the 5,000 estimated species, only three aphids have published whole genome sequences: the pea aphid Acyrthosiphon pisum, the Russian wheat aphid, Diuraphis noxia, and the green peach aphid, Myzus persicae. We present the whole genome sequence of a fourth aphid, the soybean aphid (Aphis glycines), which is an extreme specialist and an important invasive pest of soybean (Glycine max). The availability of genomic resources is important to establish effective and sustainable pest control, as well as to expand our understanding of aphid evolution. We generated a 302.9 Mbp draft genome assembly for Ap. glycines using a hybrid sequencing approach. This assembly shows high completeness with 19,182 predicted genes, 92% of known Ap. glycines transcripts mapping to contigs, and substantial continuity with a scaffold N50 of 174,505 bp. The assembly represents 95.5% of the predicted genome size of 317.1 Mbp based on flow cytometry. Ap. glycines contains the smallest known aphid genome to date, based on updated genome sizes for 19 aphid species. The repetitive DNA content of the Ap. glycines genome assembly (81.6 Mbp or 26.94% of the 302.9 Mbp assembly) shows a reduction in the number of classified transposable elements compared to Ac. pisum, and likely contributes to the small estimated genome size. We include comparative analyses of gene families related to host-specificity (cytochrome P450’s and effectors), which may be important in Ap. glycines evolution. This Ap. glycines draft genome sequence will provide a resource for the study of aphid genome evolution, their interaction with host plants, and candidate genes for novel insect control methods. Copyright © 2017 Elsevier Ltd. All rights reserved.


September 21, 2019

Divergent selection causes whole genome differentiation without physical linkage among the targets in Spodoptera frugiperda (Noctuidae)

The process of speciation involves whole genome differentiation by overcoming gene flow between diverging populations. We have ample knowledge of which evolutionary forces may cause genomic differentiation, and several speciation models have been proposed to explain the transition from genetic to genomic differentiation. However, it is still unclear which conditions are critical for enabling genomic differentiation in nature. The fall armyworm, Spodoptera frugiperda, is observed as two sympatric strains that have different host-plant ranges, suggesting the possibility of ecological divergent selection. In our previous study, we observed that these two strains show genetic differentiation across the whole genome, albeit to an unprecedentedly low extent, suggesting the possibility that whole genome sequences have started to differentiate between the strains. In this study, we analyzed whole genome sequences from these two strains from Mississippi to identify critical evolutionary factors for genomic differentiation. The genomic Fst is low (0.017), while 91.3% of 10 kb windows have Fst greater than 0, suggesting genome-wide differentiation to a low extent. We identified nearly 400 outliers of genetic differentiation between strains, and found that physical linkage among these outliers is not a primary cause of genomic differentiation. Fst is not significantly correlated with gene density, a proxy for the strength of selection, suggesting that a genomic reduction in migration rate dominates the extent of local genetic differentiation. Our analyses reveal that divergent selection alone is sufficient to generate genomic differentiation, and any following diversifying factors may increase the level of genetic differentiation between diverging strains in the process of speciation.


September 21, 2019

Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.

Long-read, single-molecule real-time (SMRT) sequencing is routinely used to finish microbial genomes, but available assembly methods have not scaled well to larger genomes. We introduce the MinHash Alignment Process (MHAP) for overlapping noisy, long reads using probabilistic, locality-sensitive hashing. Integrating MHAP with the Celera Assembler enabled reference-grade de novo assemblies of Saccharomyces cerevisiae, Arabidopsis thaliana, Drosophila melanogaster and a human hydatidiform mole cell line (CHM1) from SMRT sequencing. The resulting assemblies are highly continuous, include fully resolved chromosome arms and close persistent gaps in these reference genomes. Our assembly of D. melanogaster revealed previously unknown heterochromatic and telomeric transition sequences, and we assembled low-complexity sequences from CHM1 that fill gaps in the human GRCh38 reference. Using MHAP and the Celera Assembler, single-molecule sequencing can produce de novo near-complete eukaryotic assemblies that are 99.99% accurate when compared with available reference genomes.
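
The locality-sensitive hashing idea behind MHAP can be illustrated with a generic MinHash sketch: each sequence is reduced to the minimum hash of its k-mers under several hash functions, and the fraction of matching minima estimates the Jaccard similarity of the two k-mer sets. This is a minimal sketch of the general technique and not MHAP's actual implementation. The function names, the use of salted `blake2b` as the hash family, and the values of `k` and `num_hashes` are assumptions chosen for the example.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def minhash_sketch(seq, k=4, num_hashes=64):
    """Build a MinHash sketch: one minimum hash value per salted hash function.

    Assumes len(seq) >= k; a shorter sequence has no k-mers and raises ValueError.
    """
    kmer_set = kmers(seq, k)
    sketch = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # a distinct salt gives a distinct hash function
        sketch.append(min(
            int.from_bytes(
                hashlib.blake2b(kmer.encode(), digest_size=8, salt=salt).digest(),
                "big",
            )
            for kmer in kmer_set
        ))
    return sketch


def estimate_jaccard(sketch_a, sketch_b):
    """Estimate Jaccard similarity as the fraction of matching sketch positions."""
    matches = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)


# Two hypothetical reads that share most of their sequence.
read_a = "ACGTTGCATGCCGATAGCTAGGCTAACGT"
read_b = "ACGTTGCATGCCGTTAGCTAGGCTAACGT"
similarity = estimate_jaccard(minhash_sketch(read_a), minhash_sketch(read_b))
print(f"estimated k-mer Jaccard similarity: {similarity:.2f}")
```

Comparing fixed-size sketches avoids aligning every pair of reads directly, so only pairs with a high estimated similarity need to be examined as candidate overlaps.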

