July 7, 2019  |  

Towards integration of population and comparative genomics in forest trees.

The past decade saw the initiation of an ongoing revolution in sequencing technologies that is transforming all fields of biology. This has been driven by the advent and widespread availability of high-throughput, massively parallel short-read sequencing (MPS) platforms. These technologies have enabled previously unimaginable studies, including draft assemblies of the massive genomes of coniferous species and population-scale resequencing. Transcriptomics studies have likewise been transformed, with RNA-sequencing enabling studies in nonmodel organisms, the discovery of previously unannotated genes (novel transcripts), entirely new classes of RNAs and previously unknown regulatory mechanisms. Here we touch upon current developments in the areas of genome assembly, comparative regulomics and population genetics as they relate to studies of forest tree species. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.


July 7, 2019  |  

Comparative genomics of Campylobacter iguaniorum to unravel genetic regions associated with reptilian hosts.

Campylobacter iguaniorum is most closely related to the species C. fetus, C. hyointestinalis, and C. lanienae. Reptiles, chelonians and lizards in particular, appear to be a primary reservoir of this Campylobacter species. Here we report the genome comparison of C. iguaniorum strain 1485E, isolated from a bearded dragon (Pogona vitticeps), and strain 2463D, isolated from a green iguana (Iguana iguana), with the genomes of closely related taxa, in particular with reptile-associated C. fetus subsp. testudinum. In contrast to C. fetus, C. iguaniorum is lacking an S-layer encoding region. Furthermore, a defined lipooligosaccharide biosynthesis locus, encoding multiple glycosyltransferases and bounded by waa genes, is absent from C. iguaniorum. Instead, multiple predicted glycosylation regions were identified in C. iguaniorum. One of these regions is >50 kb with deviant G + C content, suggesting acquisition via lateral transfer. These similar, but non-homologous glycosylation regions were located at the same position on the genome in both strains. Multiple genes encoding respiratory enzymes not identified to date within the C. fetus clade were present. C. iguaniorum shared highest homology with C. hyointestinalis and C. fetus. As in reptile-associated C. fetus subsp. testudinum, a putative tricarballylate catabolism locus was identified. However, despite colonizing a shared host, no recent recombination between both taxa was detected. This genomic study provides a better understanding of host adaptation, virulence, phylogeny, and evolution of C. iguaniorum and related Campylobacter taxa. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


July 7, 2019  |  

A photoreceptor contributes to the natural variation of diapause induction in Daphnia magna.

Diapause is an adaptation that allows organisms to survive harsh environmental conditions. In species occurring over broad habitat ranges, both the timing and the intensity of diapause induction can vary across populations, revealing patterns of local adaptation. Understanding the genetic architecture of this fitness-related trait would help clarify how populations adapt to their local environments. In the cyclical parthenogenetic crustacean Daphnia magna, diapause induction is a phenotypically plastic life history trait linked to sexual reproduction, as asexual females have the ability to switch to sexual reproduction and produce resting stages, their sole strategy for surviving habitat deterioration. We have previously shown that the induction of resting stage production correlates with changes in photoperiod that indicate the imminence of habitat deterioration and have identified a Quantitative Trait Locus (QTL) responsible for some of the variation in the induction of resting stages. Here, new data allow us to anchor the QTL to a large scaffold and then, using a combination of a new mapping panel, targeted association mapping and selection analysis in natural populations, to identify candidate genes within the QTL. Our results show that variation in a rhodopsin photoreceptor gene plays a significant role in the variation observed in resting stage induction. This finding provides a mechanistic explanation for the link between diapause and day-length perception that has been suggested in diverse arthropod taxa. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.


July 7, 2019  |  

Decay of sexual trait genes in an asexual parasitoid wasp.

Trait loss is a widespread phenomenon with pervasive consequences for a species’ evolutionary potential. The genetic changes underlying trait loss have only been clarified in a small number of cases. None of these studies can identify whether the loss of the trait under study was a result of neutral mutation accumulation or negative selection. This distinction is relatively clear-cut in the loss of sexual traits in asexual organisms. Male-specific sexual traits are not expressed and can only decay through neutral mutations, whereas female-specific traits are expressed and subject to negative selection. We present the genome of an asexual parasitoid wasp and compare it to that of a sexual lineage of the same species. We identify a short-list of 16 genes for which the asexual lineage carries deleterious SNP or indel variants, whereas the sexual lineage does not. Using tissue-specific expression data from other insects, we show that fifteen of these are expressed in male-specific reproductive tissues. Only one deleterious variant was found that is expressed in the female-specific spermathecae, a trait that is heavily degraded and thought to be under negative selection in L. clavipes. Although the phenotypic decay of male-specific sexual traits in asexuals is generally slow compared with the decay of female-specific sexual traits, we show that male-specific traits do indeed accumulate deleterious mutations as expected by theory. Our results provide an excellent starting point for detailed study of the genomics of neutral and selected trait decay.


July 7, 2019  |  

Whole genome analysis of Yersinia ruckeri isolated over 27 years in Australia and New Zealand reveals geographical endemism over multiple lineages and recent evolution under host selection.

Yersinia ruckeri is a salmonid pathogen with widespread distribution in cool-temperate waters including Australia and New Zealand, two isolated environments with recently developed salmonid farming industries. Phylogenetic comparison of 58 isolates from Australia, New Zealand, USA, Chile, Finland and China based on non-recombinant core genome SNPs revealed multiple deep-branching lineages, with a most recent common ancestor estimated at 18,500 years BP (12,355-24,757, 95% HPD) and evidence of Australasian endemism. Evolution within the Tasmanian Atlantic salmon serotype O1b lineage has been slow, with 63 SNPs describing the variance over 27 years. Isolates from the prevailing lineage are poorly/non-motile compared to a lineage pre-vaccination, introduced in 1997, which is highly motile but has not been isolated since from epizootics. A non-motile phenotype has arisen independently in Tasmania compared to Europe and USA through a frameshift in fliI, encoding the ATPase of the flagella cluster. We report for the first time lipopolysaccharide O-antigen serotype O2 isolates in Tasmania. This phenotype results from deletion of the O-antigen cluster and consequent loss of high-molecular-weight O-antigen. This phenomenon has occurred independently on three occasions on three continents (Australasia, North America and Asia) as O2 isolates from the USA, China and Tasmania share the O-antigen deletion but occupy distant lineages. Despite the European and North American origins of the Australasian salmonid stocks, the lineages of Y. ruckeri in Australia and New Zealand are distinct from those of the northern hemisphere, suggesting they are pre-existing ancient strains that have emerged and evolved with the introduction of susceptible hosts following European colonization.


July 7, 2019  |  

Chromosome assembly of large and complex genomes using multiple references

Despite the rapid development of sequencing technologies, assembly of mammalian-scale genomes into complete chromosomes remains one of the most challenging problems in bioinformatics. To help address this difficulty, we developed Ragout, a reference-assisted assembly tool that now works for large and complex genomes. Taking one or more target assemblies (generated from an NGS assembler) and one or multiple related reference genomes, Ragout infers the evolutionary relationships between the genomes and builds the final assemblies using a genome rearrangement approach. Using Ragout, we transformed NGS assemblies of 15 different Mus musculus and one Mus spretus genomes into sets of complete chromosomes, leaving less than 5% of sequence unlocalized per set. Various benchmarks, including PCR testing and realigning of long PacBio reads, suggest only a small number of structural errors in the final assemblies, comparable with direct assembly approaches. Additionally, we applied Ragout to Mus caroli and Mus pahari genomes, which exhibit karyotype-scale variations compared to other genomes from the Muridae family. Chromosome color maps confirmed most large-scale rearrangements that Ragout detected.


July 7, 2019  |  

Spontaneous chloroplast mutants mostly occur by replication slippage and show a biased pattern in the plastome of Oenothera.

Spontaneous plastome mutants have been used as a research tool since the beginning of genetics. However, technical restrictions have severely limited their contributions to research in physiology and molecular biology. Here, we used full plastome sequencing to systematically characterize a collection of 51 spontaneous chloroplast mutants in Oenothera (evening primrose). Most mutants carry only a single mutation. Unexpectedly, the vast majority of mutations do not represent single nucleotide polymorphisms but are insertions/deletions originating from DNA replication slippage events. Only very few mutations appear to be caused by imprecise double-strand break repair, nucleotide misincorporation during replication, or incorrect nucleotide excision repair following oxidative damage. U-turn inversions were not detected. Replication slippage is induced at repetitive sequences that can be very small and tend to have high A/T content. Interestingly, the mutations are not distributed randomly in the genome. The underrepresentation of mutations caused by faulty double-strand break repair might explain the high structural conservation of seed plant plastomes throughout evolution. In addition to providing a fully characterized mutant collection for future research on plastid genetics, gene expression, and photosynthesis, our work identified the spectrum of spontaneous mutations in plastids and reveals that this spectrum is very different from that in the nucleus. © 2016 American Society of Plant Biologists. All rights reserved.


July 7, 2019  |  

Susan Celniker: Foundational resources to study a dynamic genome.

The Genetics Society of America’s George W. Beadle Award honors individuals who have made outstanding contributions to the community of genetics researchers and who exemplify the qualities of its namesake. The 2016 recipient, Susan E. Celniker, played a key role in the sequencing, annotation, and characterization of the Drosophila genome. She participated in early sequencing efforts at the Lawrence Berkeley National Laboratory and led the modENCODE Fly Transcriptome Consortium. Her efforts were critical to ensuring that the Drosophila genome was well-annotated, making it one of the best curated animal genomes available. As the Principal Investigator for the BDGP, Celniker has enabled the study of proteomes by creating a collection of over 13,000 clones that match annotated genes for protein expression in cells or transgenic flies, and she has established the most comprehensive spatial gene expression atlas in any organism, with in situ imaging of more than 80% of the Drosophila protein-coding transcriptome through embryogenesis. In addition to providing the research community with these invaluable resources and reagents, she continues to develop new tools and datasets for genetics researchers to explore the spatial and temporal control of gene expression.


July 7, 2019  |  

Lepidoptera genomes: current knowledge, gaps and future directions.

Butterflies and moths (Lepidoptera) are one of the most ecologically diverse and speciose insect orders. With recent advances in genomics, new Lepidoptera genomes are regularly being sequenced, and many of them are playing principal roles in genomics studies, particularly in the fields of phylogenomics and functional genomics. Thus far, assembled genomes are only available for <10 of the 43 Lepidoptera superfamilies. Nearly all are model species, found in the speciose clade Ditrysia. Community support for Lepidoptera genomics is growing with successful management and dissemination of data and analytical tools in centralized databases. With genomic studies quickly becoming integrated with ecological and evolutionary research, the Lepidoptera community will unquestionably benefit from new high-quality reference genomes that are more evenly distributed throughout the order. Copyright © 2018 Elsevier Inc. All rights reserved.


July 7, 2019  |  

Inferring synteny between genome assemblies: a systematic evaluation.

Genome assemblies across all domains of life are being produced routinely. Initial analysis of a new genome usually includes annotation and comparative genomics. Synteny provides a framework in which conservation of homologous genes and gene order is identified between genomes of different species. The availability of human and mouse genomes paved the way for algorithm development in large-scale synteny mapping, which eventually became an integral part of comparative genomics. Synteny analysis is regularly performed on assembled sequences that are fragmented, neglecting the fact that most methods were developed using complete genomes. It is unknown to what extent draft assemblies lead to errors in such analysis. We fragmented genome assemblies of model nematodes to various extents and conducted synteny identification and downstream analysis. We first show that synteny between species can be underestimated up to 40% and find disagreements between popular tools that infer synteny blocks. This inconsistency and further demonstration of erroneous gene ontology enrichment tests raise questions about the robustness of previous synteny analysis when gold standard genome sequences remain limited. In addition, assembly scaffolding using a reference guided approach with a closely related species may result in chimeric scaffolds with inflated assembly metrics if a true evolutionary relationship was overlooked. Annotation quality, however, has minimal effect on synteny if the assembled genome is highly contiguous. Our results show that a minimum N50 of 1 Mb is required for robust downstream synteny analysis, which emphasizes the importance of gold standard genomes to the science community, and should be achieved given the current progress in sequencing technology.
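
For readers unfamiliar with the N50 statistic mentioned in this abstract, the following is a minimal sketch of how it is commonly computed: sort the contig or scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50`, the constant `MIN_N50_BP`, and the example lengths are illustrative assumptions, not taken from the paper, and 1 Mb is taken here as 1,000,000 bp.

```python
from typing import Iterable

# Threshold quoted in the abstract (1 Mb), expressed in base pairs.
# Treating 1 Mb as 1,000,000 bp is an assumption made for this sketch.
MIN_N50_BP = 1_000_000


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig or scaffold lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that pushes the running total to half of
        # the assembly size defines the N50.
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for positive lengths; kept for safety


# Hypothetical scaffold lengths, purely for illustration.
example_lengths = [2_000_000, 1_500_000, 1_000_000, 500_000, 250_000]
example_n50 = n50(example_lengths)
meets_threshold = example_n50 >= MIN_N50_BP
```

In this made-up example the total is 5.25 Mb, so the two longest scaffolds already cover half of it and the N50 is the length of the second one.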


July 7, 2019  |  

Supergene evolution triggered by the introgression of a chromosomal inversion.

Supergenes are groups of tightly linked loci whose variation is inherited as a single Mendelian locus and are a common genetic architecture for complex traits under balancing selection [1-8]. Supergene alleles are long-range haplotypes with numerous mutations underlying distinct adaptive strategies, often maintained in linkage disequilibrium through the suppression of recombination by chromosomal rearrangements [1, 5, 7-9]. However, the mechanism governing the formation of supergenes is not well understood and poses the paradox of establishing divergent functional haplotypes in the face of recombination. Here, we show that the formation of the supergene alleles encoding mimicry polymorphism in the butterfly Heliconius numata is associated with the introgression of a divergent, inverted chromosomal segment. Haplotype divergence and linkage disequilibrium indicate that supergene alleles, each allowing precise wing-pattern resemblance to distinct butterfly models, originate from over a million years of independent chromosomal evolution in separate lineages. These “superalleles” have evolved from a chromosomal inversion captured by introgression and maintained in balanced polymorphism, triggering supergene inheritance. This mode of evolution involving the introgression of a chromosomal rearrangement is likely to be a common feature of complex structural polymorphisms associated with the coexistence of distinct adaptive syndromes. This shows that the reticulation of genealogies may have a powerful influence on the evolution of genetic architectures in nature. Copyright © 2018 Elsevier Ltd. All rights reserved.


July 7, 2019  |  

The sequenced angiosperm genomes and genome databases.

Angiosperms, the flowering plants, provide the essential resources for human life, such as food, energy, oxygen, and materials. They have also promoted the evolution of humans, animals, and the planet Earth. Despite the numerous advances in genome reports and sequencing technologies, no review covers all the released angiosperm genomes and the genome databases for data sharing. Based on the rapid advances and innovations in database reconstruction in the last few years, here we provide a comprehensive review of three major types of angiosperm genome databases, including databases for a single species, for a specific angiosperm clade, and for multiple angiosperm species. The scope, tools, and data of each type of database and their features are concisely discussed. The genome databases for a single species or a clade of species are especially popular with specific groups of researchers, while a timely-updated comprehensive database is more powerful for addressing major scientific mysteries at the genome scale. Considering the low coverage of flowering plants in any available database, we propose construction of a comprehensive database to facilitate large-scale comparative studies of angiosperm genomes and to promote collaborative studies of important questions in plant biology.


July 7, 2019  |  

Identification of repetitive DNA sequences in the Chrysanthemum boreale genome

We previously revealed that the Chrysanthemum boreale genome is highly repetitive; however, the types and nucleotide sequences of repetitive DNA in this diploid wild chrysanthemum are not known. Here, we characterized repetitive DNA sequences in the C. boreale genome by analysing genomic sequences obtained by Illumina sequencing and confirmed their repetitive nature by conducting fluorescence in situ hybridization (FISH) analyses. Annotation of the obtained DNA sequences revealed that microsatellite-containing genomic sequences exhibited similarity with genomic sequences in Chrysanthemum morifolium, indicating sequence conservation of repetitive DNA sequences between the two Chrysanthemum species. Two superfamilies of repetitive DNA, Copia and Gypsy, belonging to the long-terminal repeat (LTR) class of retrotransposons, are abundant in the C. boreale genome. We propose that Copia and Gypsy retroelements contribute to the current genome architecture of C. boreale. Whole genome sequencing, which is currently in progress, will reveal the extent to which these repetitive DNA sequences contribute.


July 7, 2019  |  

Tracing the de novo origin of protein-coding genes in yeast.

De novo genes are very important for evolutionary innovation. However, how these genes originate and spread remains largely unknown. To better understand this, we rigorously searched for de novo genes in Saccharomyces cerevisiae S288C and examined their spread and fixation in the population. Here, we identified 84 de novo genes in S. cerevisiae S288C since the divergence with their sister groups. Transcriptome and ribosome profiling data revealed at least 8 (10%) and 28 (33%) de novo genes being expressed and translated only under specific conditions, respectively. DNA microarray data, based on 2-fold change, showed that 87% of the de novo genes are regulated during various biological processes, such as nutrient utilization and sporulation. Our comparative and evolutionary analyses further revealed that some factors, including single nucleotide polymorphism (SNP)/indel mutation, high GC content, and DNA shuffling, contribute to the birth of de novo genes, while domestication and natural selection drive the spread and fixation of these genes. Finally, we also provide evidence suggesting the possible parallel origin of a de novo gene between S. cerevisiae and Saccharomyces paradoxus. Together, our study provides several new insights into the origin and spread of de novo genes.
IMPORTANCE: Emergence of de novo genes has occurred in many lineages during evolution, but the birth, spread, and function of these genes remain unresolved. Here we have searched for de novo genes from Saccharomyces cerevisiae S288C using rigorous methods, which reduced the effects of bad annotation and genomic gaps on the identification of de novo genes. Through this analysis, we have found 84 new genes originating de novo from previously noncoding regions, 87% of which are very likely involved in various biological processes. We noticed that 10% and 33% of de novo genes were only expressed and translated under specific conditions; therefore, verification of de novo genes through transcriptome and ribosome profiling, especially from limited expression data, may underestimate the number of bona fide new genes. We further show that SNP/indel mutation, high GC content, and DNA shuffling could be involved in the birth of de novo genes, while domestication and natural selection drive the spread and fixation of these genes. Finally, we provide evidence suggesting the possible parallel origin of a new gene. Copyright © 2018 Wu and Knudson.


July 7, 2019  |  

An improved approach for reconstructing consensus repeats from short sequence reads

Repeat elements are important components of most eukaryotic genomes. Most existing tools for repeat analysis rely either on high quality reference genomes or existing repeat libraries. Thus, it is still challenging to do repeat analysis for species with highly repetitive or complex genomes, which often do not have good reference genomes or annotated repeat libraries. Recently we developed a computational method called REPdenovo that constructs consensus repeat sequences directly from short sequence reads and outperforms an existing tool called RepARK. One major issue with REPdenovo is that it doesn’t perform well for repeats with relatively high divergence rates or low copy numbers. In this paper, we present an improved approach for constructing consensus repeats directly from short reads. Compared with the original REPdenovo, the improved approach uses more repeat-related k-mers and improves repeat assembly quality using a consensus-based k-mer processing method.

