July 7, 2019

Whole genome sequencing predicts novel human disease models in rhesus macaques.

Rhesus macaques are an important pre-clinical model of human disease. To advance our understanding of genomic variation that may influence disease, we surveyed genome-wide variation in 21 rhesus macaques. We employed best-practice variant calling, validated with Mendelian inheritance. Next, we used alignment data from our cohort to detect genomic regions likely to produce inaccurate genotypes, potentially due to either gene duplication or structural variation between individuals. We generated a final dataset of >16 million high confidence variants, including 13 million in Chinese-origin rhesus macaques, an increasingly important disease model. We detected an average of 131 mutations predicted to severely alter protein coding per animal, and identified 45 such variants that coincide with known pathogenic human variants. These data suggest that expanded screening of existing breeding colonies will identify novel models of human disease, and that increased genomic characterization can help inform research studies in macaques. Copyright © 2017 Elsevier Inc. All rights reserved.
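As an illustration of validating genotype calls with Mendelian inheritance, the sketch below checks whether a child's diploid genotype can be formed from one allele of each parent and reports the fraction of violating sites. It is a minimal example, not the authors' pipeline: the function names and the representation of genotypes as tuples of allele indices are assumptions made here.

```python
from itertools import product


def is_mendelian_consistent(child, mother, father):
    """Return True if the child genotype can be built from one maternal and one paternal allele.

    Genotypes are tuples of two allele indices, e.g. (0, 1) (an assumed encoding).
    """
    return any(sorted((m, f)) == sorted(child) for m, f in product(mother, father))


def mendelian_error_rate(trio_genotypes):
    """Fraction of sites whose (child, mother, father) genotypes violate Mendelian inheritance."""
    if not trio_genotypes:
        return 0.0
    errors = sum(not is_mendelian_consistent(c, m, f) for c, m, f in trio_genotypes)
    return errors / len(trio_genotypes)
```

A low error rate across trios suggests that the calls are reliable; sites that repeatedly violate inheritance are candidates for filtering.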


July 7, 2019

Genome-wide identification of the mutation underlying fleece variation and discriminating ancestral hairy species from modern woolly sheep.

The composition and structure of fleece variation observed in mammals is a consequence of a strong selective pressure for fiber production after domestication. In sheep, fleece variation discriminates ancestral species carrying a long and hairy fleece from modern domestic sheep (Ovis aries) bearing a short and woolly fleece. Here, we report that the “woolly” allele results from the insertion of an antisense EIF2S2 retrogene (called asEIF2S2) into the 3′ UTR of the IRF2BP2 gene leading to an abnormal IRF2BP2 transcript. We provide evidence that this chimeric IRF2BP2/asEIF2S2 messenger 1) targets the genuine sense EIF2S2 RNA and 2) creates a long endogenous double-stranded RNA which alters the expression of both EIF2S2 and IRF2BP2 mRNA. This represents a unique example of a phenotype arising via an RNA-RNA hybrid, itself generated through a retroposition mechanism. Our results bring new insights into the sheep population history thanks to the identification of the molecular origin of an evolutionary phenotypic variation. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


July 7, 2019

Correspondence on Lovell et al.: response to Bornelöv et al.

While the analysis of Bornelöv et al. is informative, they provide evidence for the existence of only 3% of the reported avian missing genes set, and thus do not significantly challenge our main findings that specific groups of syntenic protein-coding genes are missing in birds. This is a response to the Correspondence article: https://www.dx.doi.org/10.1186/s13059-017-1231-1.


July 7, 2019

Resolving multicopy duplications de novo using polyploid phasing

While the rise of single-molecule sequencing systems has enabled an unprecedented rise in the ability to assemble complex regions of the genome, long segmental duplications in the genome still remain a challenging frontier in assembly. Segmental duplications are at the same time both gene rich and prone to large structural rearrangements, making the resolution of their sequences important in medical and evolutionary studies. Duplicated sequences that are collapsed in mammalian de novo assemblies are rarely identical; after a sequence is duplicated, it begins to acquire paralog-specific variants. In this paper, we study the problem of resolving the variations in multicopy, long segmental duplications by developing and utilizing algorithms for polyploid phasing. We develop two algorithms: the first one is targeted at maximizing the likelihood of observing the reads given the underlying haplotypes using discrete matrix completion. The second algorithm is based on correlation clustering and exploits an assumption, which is often satisfied in these duplications, that each paralog has a sizable number of paralog-specific variants. We develop a detailed simulation methodology and demonstrate the superior performance of the proposed algorithms on an array of simulated datasets. We measure the likelihood score as well as reconstruction accuracy, i.e., what fraction of the reads are clustered correctly. On both performance metrics, we find that our algorithms dominate existing algorithms on more than 93% of the datasets. While the discrete matrix completion performs better on likelihood score, the correlation-clustering algorithm performs better on reconstruction accuracy due to the stronger regularization inherent in the algorithm. We also show that our correlation-clustering algorithm can reconstruct on average 7.0 haplotypes in 10-copy duplication datasets whereas existing algorithms reconstruct fewer than one copy on average.
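The reconstruction accuracy described above (the fraction of reads clustered correctly) can be sketched as follows. Because cluster labels are arbitrary, this example scores the best one-to-one mapping between true paralogs and predicted clusters; that matching step, the function name, and the brute-force search over permutations are assumptions made for illustration, not details taken from the paper. The exhaustive search is only practical for a handful of copies.

```python
from collections import Counter
from itertools import permutations


def reconstruction_accuracy(true_labels, predicted_labels):
    """Fraction of reads assigned correctly under the best one-to-one label mapping."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        return 0.0
    # Count how many reads fall into each (true paralog, predicted cluster) pair.
    overlap = Counter(zip(true_labels, predicted_labels))
    true_ids = sorted(set(true_labels))
    pred_ids = sorted(set(predicted_labels))
    # Pad the shorter side with None so a one-to-one mapping always exists.
    size = max(len(true_ids), len(pred_ids))
    true_ids += [None] * (size - len(true_ids))
    pred_ids += [None] * (size - len(pred_ids))
    # Brute-force search over mappings; feasible only for small copy numbers.
    best = max(
        sum(overlap[(t, p)] for t, p in zip(true_ids, perm))
        for perm in permutations(pred_ids)
    )
    return best / len(true_labels)
```

For larger copy numbers, an assignment solver would be a more suitable replacement for the permutation search.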


July 7, 2019

The evolution of the natural killer complex; a comparison between mammals using new high-quality genome assemblies and targeted annotation.

Natural killer (NK) cells are a diverse population of lymphocytes with a range of biological roles including essential immune functions. NK cell diversity is in part created by the differential expression of cell surface receptors which modulate activation and function, including multiple subfamilies of C-type lectin receptors encoded within the NK complex (NKC). Little is known about the gene content of the NKC beyond rodent and primate lineages, other than it appears to be extremely variable between mammalian groups. We compared the NKC structure between mammalian species using new high-quality draft genome assemblies for cattle and goat; re-annotated sheep, pig, and horse genome assemblies; and the published human, rat, and mouse lemur NKC. The major NKC genes are largely in the equivalent positions in all eight species, with significant independent expansions and deletions between species, allowing us to propose a model for NKC evolution during mammalian radiation. The ruminant species, cattle and goats, have independently evolved a second KLRC locus flanked by KLRA and KLRJ, and a novel KLRH-like gene has acquired an activating tail. This novel gene has duplicated several times within cattle, while other activating receptor genes have been selectively disrupted. Targeted genome enrichment in cattle identified varying levels of allelic polymorphism between the NKC genes concentrated in the predicted extracellular ligand-binding domains. This novel recombination and allelic polymorphism is consistent with NKC evolution under balancing selection, suggesting that this diversity influences individual immune responses and may impact on differential outcomes of pathogen infection and vaccination.


July 7, 2019

MECAT: fast mapping, error correction, and de novo assembly for single-molecule sequencing reads.

We present a tool that combines fast mapping, error correction, and de novo assembly (MECAT; accessible at https://github.com/xiaochuanle/MECAT) for processing single-molecule sequencing (SMS) reads. MECAT’s computing efficiency is superior to that of current tools, while the results MECAT produces are comparable or improved. MECAT enables reference mapping or de novo assembly of large genomes using SMS reads on a single computer.


July 7, 2019

Improved annotation of the insect vector of citrus greening disease: biocuration by a diverse genomics community.

The Asian citrus psyllid (Diaphorina citri Kuwayama) is the insect vector of the bacterium Candidatus Liberibacter asiaticus (CLas), the pathogen associated with citrus Huanglongbing (HLB, citrus greening). HLB threatens citrus production worldwide. Suppression or reduction of the insect vector using chemical insecticides has been the primary method to inhibit the spread of citrus greening disease. Accurate structural and functional annotation of the Asian citrus psyllid genome, as well as a clear understanding of the interactions between the insect and CLas, are required for development of new molecular-based HLB control methods. A draft assembly of the D. citri genome has been generated and annotated with automated pipelines. However, knowledge transfer from well-curated reference genomes such as that of Drosophila melanogaster to newly sequenced ones is challenging due to the complexity and diversity of insect genomes. To identify and improve gene models as potential targets for pest control, we manually curated several gene families with a focus on genes that have key functional roles in D. citri biology and CLas interactions. This community effort produced 530 manually curated gene models across developmental, physiological, RNAi regulatory and immunity-related pathways. As previously shown in the pea aphid, RNAi machinery genes putatively involved in the microRNA pathway have been specifically duplicated. A comprehensive transcriptome enabled us to identify a number of gene families that are either missing or misassembled in the draft genome. In order to develop biocuration as a training experience, we included undergraduate and graduate students from multiple institutions, as well as experienced annotators from the insect genomics research community. The resulting gene set (OGS v1.0) combines both automatically predicted and manually curated gene models.


July 7, 2019

Beyond speciation genes: an overview of genome stability in evolution and speciation.

Genome stability ensures individual fitness and reliable transmission of genetic information. Hybridization between diverging lineages can trigger genome instability, highlighting its potential role in post-zygotic reproductive isolation. We argue that genome instability is not merely one of several types of hybrid incompatibility, but rather that genome stability is one of the very first and most fundamental traits that can break down when two diverged genomes are combined. Future work will reveal how frequent and predictable genome instability is in hybrids, how it affects hybrid fitness, and whether it is a direct cause or consequence of speciation. Copyright © 2017 Elsevier Ltd. All rights reserved.


July 7, 2019

LRCstats, a tool for evaluating long reads correction methods.

Third-generation sequencing (TGS) platforms that generate long reads, such as PacBio and Oxford Nanopore technologies, have had a dramatic impact on genomics research. However, despite recent improvements, TGS reads suffer from high error rates and the development of read correction methods is an active field of research. This motivates the need to develop tools that can evaluate the accuracy of noisy long reads correction tools. We introduce LRCstats, a tool that measures the accuracy of long reads correction tools. LRCstats takes advantage of long reads simulators that provide each simulated read with an alignment to the reference genome segment they originate from, and does not rely on a step of mapping corrected reads onto the reference genome. This allows for the measurement of the accuracy of the correction while being consistent with the actual errors introduced in the simulation process used to generate noisy reads. We illustrate the usefulness of LRCstats by analyzing the accuracy of four hybrid correction methods for PacBio long reads over three datasets. Availability: https://github.com/cchauve/lrcstats. Contact: laseanl@sfu.ca or cedric.chauve@sfu.ca. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
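To make the idea of measuring correction accuracy against a known origin concrete, the sketch below compares a read with the reference segment it was simulated from, before and after correction. It is not LRCstats's method or interface: the plain edit-distance metric and the function names are assumptions chosen for a simple, self-contained illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences, using a rolling row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # deletion from a
                curr[j - 1] + 1,           # insertion into a
                prev[j - 1] + (ca != cb),  # match or substitution
            ))
        prev = curr
    return prev[-1]


def error_rate(read, reference_segment):
    """Edit distance to the true reference segment, normalized by the segment length."""
    if not reference_segment:
        raise ValueError("reference segment must be non-empty")
    return edit_distance(read, reference_segment) / len(reference_segment)
```

Comparing `error_rate(noisy_read, segment)` with `error_rate(corrected_read, segment)` for each simulated read gives a rough per-read view of how much a correction tool helped.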


July 7, 2019

Complete fusion of a transposon and herpesvirus created the Teratorn mobile element in medaka fish.

Mobile genetic elements (e.g., transposable elements and viruses) display significant diversity with various life cycles, but how novel elements emerge remains obscure. Here, we report a giant (180-kb long) transposon, Teratorn, originally identified in the genome of medaka, Oryzias latipes. Teratorn belongs to the piggyBac superfamily and retains the transposition activity. Remarkably, Teratorn is largely derived from a herpesvirus of the Alloherpesviridae family that could infect fish and amphibians. Genomic survey of Teratorn-like elements reveals that some of them exist as a fused form between piggyBac transposon and herpesvirus genome in teleosts, implying the generality of transposon-herpesvirus fusion. We propose that Teratorn was created by a unique fusion of DNA transposon and herpesvirus, leading to life cycle shift. Our study supports the idea that recombination is the key event in generation of novel mobile genetic elements. Teratorn is a large mobile genetic element originally identified in the small teleost fish medaka. Here, the authors show that Teratorn is derived from the fusion of a piggyBac superfamily DNA transposon and an alloherpesvirus and that it is widely found across teleost fish.


July 7, 2019

A step-by-step guide to assemble a reptilian genome.

Multiple technologies and software are now available facilitating the de novo sequencing and assembly of any vertebrate genome. Yet the quality of most available sequenced genomes is substantially poorer than that of the gold standard in the field: the human genome. Here, we present a step-by-step protocol for the successful sequencing and assembly of a high-quality snake genome that can be applied to any other reptilian or avian species. We combine the great sequencing depth and accuracy of short reads with the use of different insert size libraries for extended scaffolding followed by optical mapping. We show that this procedure improved the corn snake scaffold N50 from 3.7 kbp to 1.4 Mbp, currently making it one of the snake genomes with the longest scaffolds. Short guidelines are also given on the extraction of long DNA molecules from reptilian blood and the necessary modifications in DNA extraction protocols. This chapter is accompanied by a website (www.reptilomics.org/stepbystep.html), where we provide links to the suggested software, examples of input and output files, and running parameters.
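The scaffold N50 quoted above is the length of the scaffold at which, going from longest to shortest, the cumulative length first reaches half of the total assembly size. A minimal sketch of that calculation follows; the function name is chosen here for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half of the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
```

For example, scaffolds of lengths 2 through 10 sum to 54, and the three longest (10, 9, 8) already cover 27, so the N50 is 8.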


July 7, 2019

Variations in 5S rDNAs in diploid and tetraploid offspring of red crucian carp × common carp.

The allotetraploid hybrid fish (4nAT) that was created in a previous study through an intergeneric cross between red crucian carp (Carassius auratus red var., ♀) and common carp (Cyprinus carpio L., ♂) provided an excellent platform to investigate the effect of hybridization and polyploidization on the evolution of 5S rDNA. The 5S rDNAs of paternal common carp were made up of a coding sequence (CDS) and a non-transcribed spacer (NTS) unit, and while the 5S rDNAs of maternal red crucian carp contained a CDS and a NTS unit, they also contained a variable number of interposed regions (IPRs). The CDSs of the 5S rDNAs in both parental fishes were conserved, while their NTS units seemed to have been subjected to rapid evolution. The diploid hybrid 2nF1 inherited all the types of 5S rDNAs in both progenitors and there were no signs of homeologous recombination in the 5S rDNAs of 2nF1 by sequencing of PCR products. We obtained two segments of 5S rDNA with a total length of 16,457 bp from allotetraploid offspring 4nAT through bacterial artificial chromosome (BAC) sequencing. Using this sequence together with the 5S rDNA sequences amplified from the genomic DNA of 4nAT, we deduced that the 5S rDNAs of 4nAT might be inherited from the maternal progenitor red crucian carp. Additionally, the IPRs in the 5S rDNAs of 4nAT contained A-repeats and TA-repeats, which was not the case for the IPRs in the 5S rDNAs of 2nF1. We also detected two signals of a 200-bp fragment of 5S rDNA in the chromosomes of parental progenitors and hybrid progenies by fluorescence in situ hybridization (FISH). We deduced that during the evolution of 5S rDNAs in different ploidy hybrid fishes, interlocus gene conversion events and tandem repeat insertion events might have occurred in the process of polyploidization. This study provided new insights into the relationship among the evolution of 5S rDNAs, hybridization and polyploidization, which were significant in clarifying the genome evolution of polyploid fish.


July 7, 2019

Convergent evolution of Y chromosome gene content in flies.

Sex-chromosomes have formed repeatedly across Diptera from ordinary autosomes, and X-chromosomes mostly conserve their ancestral genes. Y-chromosomes are characterized by abundant gene-loss and an accumulation of repetitive DNA, yet the nature of the gene repertoire of fly Y-chromosomes is largely unknown. Here we trace gene-content evolution of Y-chromosomes across 22 Diptera species, using a subtraction pipeline that infers Y genes from male and female genome, and transcriptome data. Few genes remain on old Y-chromosomes, but the number of inferred Y-genes varies substantially between species. Young Y-chromosomes still show clear evidence of their autosomal origins, but most genes on old Y-chromosomes are not simply remnants of genes originally present on the proto-sex-chromosome that escaped degeneration, but instead were recruited secondarily from autosomes. Despite almost no overlap in Y-linked gene content in different species with independently formed sex-chromosomes, we find that Y-linked genes have evolved convergent gene functions associated with testis expression. Thus, male-specific selection appears as a dominant force shaping gene-content evolution of Y-chromosomes across fly species. While X-chromosome gene content tends to be conserved, Y-chromosome evolution is dynamic and difficult to reconstruct. Here, Mahajan and Bachtrog use a subtraction pipeline to identify Y-linked genes in 22 Diptera species, revealing patterns of Y-chromosome gene-content evolution.
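The general logic of a male/female subtraction approach can be sketched as follows: sequences that are well covered by male reads but nearly absent from female reads are flagged as Y-linked candidates. This is only an illustration of the idea, not the authors' pipeline; the function name, the per-sequence coverage dictionaries, and both thresholds are hypothetical.

```python
def candidate_y_sequences(male_coverage, female_coverage, min_male=5.0, max_ratio=0.1):
    """Return names of sequences covered in males but (almost) not in females.

    male_coverage / female_coverage map sequence names to mean read depth.
    min_male and max_ratio are illustrative thresholds, not published values.
    """
    candidates = []
    for name, male_depth in male_coverage.items():
        female_depth = female_coverage.get(name, 0.0)
        # Require real male support and a female depth that is a small fraction of it.
        if male_depth >= min_male and female_depth <= max_ratio * male_depth:
            candidates.append(name)
    return sorted(candidates)
```

Autosomal and X-linked sequences have substantial female coverage and are subtracted out, while sequences with too little male coverage are ignored as unreliable.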

