July 19, 2019

The mitochondrial genome map of Nelumbo nucifera reveals ancient evolutionary features.

Nelumbo nucifera is an evolutionary relic from the Late Cretaceous period. Sequencing the N. nucifera mitochondrial genome is important for elucidating the evolutionary characteristics of basal eudicots. Here, the N. nucifera mitochondrial genome was sequenced using single molecule real-time sequencing technology (SMRT), and the mitochondrial genome map was constructed after de novo assembly and annotation. The results showed that the 524,797-bp N. nucifera mitochondrial genome has a total of 63 genes, including 40 protein-coding genes, three rRNA genes and 20 tRNA genes. Fifteen collinear gene clusters were conserved across different plant species. Approximately 700 RNA editing sites in the protein-coding genes were identified. Positively selected genes were identified with selection pressure analysis. Nineteen chloroplast-derived fragments were identified, and seven tRNAs were derived from the chloroplast. These results suggest that the N. nucifera mitochondrial genome retains evolutionarily conserved characteristics, including ancient gene content and gene clusters, high levels of RNA editing, and low levels of chloroplast-derived fragment insertions. As the first publicly available basal eudicot mitochondrial genome, the N. nucifera mitochondrial genome facilitates further analysis of the characteristics of basal eudicots and provides clues of the evolutionary trajectory from basal angiosperms to advanced eudicots.


July 19, 2019

Emergence of a Homo sapiens-specific gene family and chromosome 16p11.2 CNV susceptibility.

Genetic differences that specify unique aspects of human evolution have typically been identified by comparative analyses between the genomes of humans and closely related primates, including more recently the genomes of archaic hominins. Not all regions of the genome, however, are equally amenable to such study. Recurrent copy number variation (CNV) at chromosome 16p11.2 accounts for approximately 1% of cases of autism and is mediated by a complex set of segmental duplications, many of which arose recently during human evolution. Here we reconstruct the evolutionary history of the locus and identify bolA family member 2 (BOLA2) as a gene duplicated exclusively in Homo sapiens. We estimate that a 95-kilobase-pair segment containing BOLA2 duplicated across the critical region approximately 282 thousand years ago (ka), one of the latest among a series of genomic changes that dramatically restructured the locus during hominid evolution. All humans examined carried one or more copies of the duplication, which nearly fixed early in the human lineage, a pattern unlikely to have arisen so rapidly in the absence of selection (P?


July 19, 2019

Extensive sequence divergence between the reference genomes of two elite indica rice varieties Zhenshan 97 and Minghui 63.

Asian cultivated rice consists of two subspecies: Oryza sativa subsp. indica and O. sativa subsp. japonica. Despite the fact that indica rice accounts for over 70% of total rice production worldwide and is genetically much more diverse, a high-quality reference genome for indica rice has yet to be published. We conducted map-based sequencing of two indica rice lines, Zhenshan 97 (ZS97) and Minghui 63 (MH63), which represent the two major varietal groups of the indica subspecies and are the parents of an elite Chinese hybrid. The genome sequences were assembled into 237 (ZS97) and 181 (MH63) contigs, with an accuracy >99.99%, and covered 90.6% and 93.2% of their estimated genome sizes. Comparative analyses of these two indica genomes uncovered surprising structural differences, especially with respect to inversions, translocations, presence/absence variations, and segmental duplications. Approximately 42% of nontransposable element related genes were identical between the two genomes. Transcriptome analysis of three tissues showed that 1,059-2,217 more genes were expressed in the hybrid than in the parents and that the expressed genes in the hybrid were much more diverse due to their divergence between the parental genomes. The public availability of two high-quality reference genomes for the indica subspecies of rice will have large-ranging implications for plant biology and crop genetic improvement.


July 19, 2019

Long read sequencing technology to solve complex genomic regions assembly in plants

Background: Numerous completed or on-going whole genome sequencing projects have highlighted the fact that obtaining a high quality genome sequence is necessary to address comparative genomics questions such as structural variations among genotypes and gain or loss of specific function. Despite the spectacular progress that has been made in sequencing technologies, obtaining accurate and reliable data is still a challenge, both at the whole genome scale and when targeting specific genomic regions. These problems are even more noticeable for complex plant genomes. Most plant genomes are known to be particularly challenging due to their size, high density of repetitive elements and various levels of ploidy. To overcome these problems, we have developed a strategy to reduce genome complexity by using the large insert BAC libraries combined with next generation sequencing technologies. Results: We compared two different technologies (Roche-454 and Pacific Biosciences PacBio RS II) to sequence pools of BAC clones in order to obtain the best quality sequence. We targeted nine BAC clones from different species (maize, wheat, strawberry, barley, sugarcane and sunflower) known to be complex in terms of sequence assembly. We sequenced the pools of the nine BAC clones with both technologies. We compared assembly results and highlighted differences due to the sequencing technologies used. Conclusions: We demonstrated that the long reads obtained with the PacBio RS II technology serve to obtain a better and more reliable assembly, notably by preventing errors due to duplicated or repetitive sequences in the same region.


July 19, 2019

Variation and evolution in the glutamine-rich repeat region of Drosophila argonaute-2.

RNA interference pathways mediate biological processes through Argonaute-family proteins, which bind small RNAs as guides to silence complementary target nucleic acids. In insects and crustaceans Argonaute-2 silences viral nucleic acids, and therefore acts as a primary effector of innate antiviral immunity. Although the functions of the major Argonaute-2 domains, which are conserved across most Argonaute-family proteins, are known, many invertebrate Argonaute-2 homologs contain a glutamine-rich repeat (GRR) region of unknown function at the N-terminus. Here we combine long-read amplicon sequencing of Drosophila Genetic Reference Panel (DGRP) lines with publicly available sequence data from many insect species to show that this region evolves extremely rapidly and is hyper-variable within species. We identify distinct GRR haplotype groups in Drosophila melanogaster, and suggest that one of these haplotype groups has recently risen to high frequency in a North American population. Finally, we use published data from genome-wide association studies of viral resistance in D. melanogaster to test whether GRR haplotypes are associated with survival after virus challenge. We find a marginally significant association with survival after challenge with Drosophila C Virus in the DGRP, but we were unable to replicate this finding using lines from the Drosophila Synthetic Population Resource panel. Copyright © 2016 Palmer and Obbard.


July 19, 2019

Living apart together: crosstalk between the core and supernumerary genomes in a fungal plant pathogen.

Eukaryotes display remarkable genome plasticity, which can include supernumerary chromosomes that differ markedly from the core chromosomes. Despite the widespread occurrence of supernumerary chromosomes in fungi, their origin, relation to the core genome and the reason for their divergent characteristics are still largely unknown. The complexity of genome assembly due to the presence of repetitive DNA partially accounts for this. Here we use single-molecule real-time (SMRT) sequencing to assemble the genome of a prominent fungal wheat pathogen, Fusarium poae, including at least one supernumerary chromosome. The core genome contains limited transposable elements (TEs) and no gene duplications, while the supernumerary genome holds up to 25% TEs and multiple gene duplications. The core genome shows all hallmarks of repeat-induced point mutation (RIP), a defense mechanism against TEs, specific for fungi. The absence of RIP on the supernumerary genome accounts for the differences between the two (sub)genomes, and results in a functional crosstalk between them. The supernumerary genome is a reservoir for TEs that migrate to the core genome, and even large blocks of supernumerary sequence (>200 kb) have recently translocated to the core. Vice versa, the supernumerary genome acts as a refuge for genes that are duplicated from the core genome. For the first time, a mechanism was determined that explains the differences that exist between the core and supernumerary genome in fungi. Different biology rather than origin was shown to be responsible. A “living apart together” crosstalk exists between the core and supernumerary genome, accelerating chromosomal and organismal evolution.


July 19, 2019

Condition-dependent co-regulation of genomic clusters of virulence factors in the grapevine trunk pathogen Neofusicoccum parvum.

The ascomycete Neofusicoccum parvum, one of the causal agents of Botryosphaeria dieback, is a destructive wood-infecting fungus and a serious threat to grape production worldwide. The capability to colonize woody tissue, combined with the secretion of phytotoxic compounds, is thought to underlie its pathogenicity and virulence. Here, we describe the repertoire of virulence factors and their transcriptional dynamics as the fungus feeds on different substrates and colonizes the woody stem. We assembled and annotated a highly contiguous genome using single-molecule real-time DNA sequencing. Transcriptome profiling by RNA sequencing determined the genome-wide patterns of expression of virulence factors both in vitro (potato dextrose agar or medium amended with grape wood as substrate) and in planta. Pairwise statistical testing of differential expression, followed by co-expression network analysis, revealed that physically clustered genes coding for putative virulence functions were induced depending on the substrate or stage of plant infection. Co-expressed gene clusters were significantly enriched not only in genes associated with secondary metabolism, but also in those associated with cell wall degradation, suggesting that dynamic co-regulation of transcriptional networks contributes to multiple aspects of N. parvum virulence. In most of the co-expressed clusters, all genes shared at least a common motif in their promoter region, indicative of co-regulation by the same transcription factor. Co-expression analysis also identified chromatin regulators with correlated expression with inducible clusters of virulence factors, suggesting a complex, multi-layered regulation of the virulence repertoire of N. parvum. © 2016 BSPP AND JOHN WILEY & SONS LTD.


July 19, 2019

Winding paths to simplicity: genome evolution in facultative insect symbionts.

Symbiosis between organisms is an important driving force in evolution. Among the diverse relationships described, extensive progress has been made in insect-bacteria symbiosis, which improved our understanding of the genome evolution in host-associated bacteria. Particularly, investigations on several obligate mutualists have pushed the limits of what we know about the minimal genomes for sustaining cellular life. To bridge the gap between those obligate symbionts with extremely reduced genomes and their non-host-restricted ancestors, this review focuses on the recent progress in genome characterization of facultative insect symbionts. Notable cases representing various types and stages of host associations, including those from multiple genera in the family Enterobacteriaceae (class Gammaproteobacteria), Wolbachia (Alphaproteobacteria) and Spiroplasma (Mollicutes), are discussed. Although several general patterns of genome reduction associated with the adoption of symbiotic relationships could be identified, extensive variation was found among these facultative symbionts. These findings are incorporated into the established conceptual frameworks to develop a more detailed evolutionary model for the discussion of possible trajectories. In summary, transitions from facultative to obligate symbiosis do not appear to be a universal one-way street; switches between hosts and lifestyles (e.g. commensalism, parasitism or mutualism) occur frequently and could be facilitated by horizontal gene transfer. © FEMS 2016.


July 19, 2019

Extraction of high-molecular-weight genomic DNA for long-read sequencing of single molecules.

De novo sequencing of complex genomes is one of the main challenges for researchers seeking high-quality reference sequences. Many de novo assemblies are based on short reads, producing fragmented genome sequences. Third-generation sequencing, with read lengths >10 kb, will improve the assembly of complex genomes, but these techniques require high-molecular-weight genomic DNA (gDNA), and gDNA extraction protocols used for obtaining smaller fragments for short-read sequencing are not suitable for this purpose. Methods of preparing gDNA for bacterial artificial chromosome (BAC) libraries could be adapted, but these approaches are time-consuming, and commercial kits for these methods are expensive. Here, we present a protocol for rapid, inexpensive extraction of high-molecular-weight gDNA from bacteria, plants, and animals. Our technique was validated using sunflower leaf samples, producing a mean read length of 12.6 kb and a maximum read length of 80 kb.
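
The read-length figures quoted above (mean and maximum) are simple summaries of a run's read-length distribution. The following is a minimal sketch of how such summaries might be computed; the function name `read_length_summary` and the example lengths are hypothetical and are not taken from the study, and the N50 value is an extra statistic commonly reported for long reads that the abstract itself does not mention.

```python
from statistics import mean


def read_length_summary(lengths):
    """Summarize read lengths (in bp): mean, maximum, and N50.

    N50 is the length L such that reads of length >= L contain
    at least half of all sequenced bases.
    """
    if not lengths:
        raise ValueError("at least one read length is required")
    ordered = sorted(lengths, reverse=True)
    total_bases = sum(ordered)
    running = 0
    n50 = ordered[-1]
    for length in ordered:
        running += length
        if running * 2 >= total_bases:
            n50 = length
            break
    return {"mean": mean(ordered), "max": ordered[0], "n50": n50}


# Hypothetical read lengths in bp, for illustration only.
example_lengths = [2000, 8000, 12000, 15000, 30000]
summary = read_length_summary(example_lengths)
print(summary)
```

In practice the lengths would be read from the sequencing output rather than typed in by hand; the list here only keeps the sketch self-contained.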


July 19, 2019

Short tandem repeats, segmental duplications, gene deletion, and genomic instability in a rapidly diversified immune gene family.

Genomic regions with repetitive sequences are considered unstable and prone to swift DNA diversification processes. A highly diverse immune gene family of the sea urchin (Strongylocentrotus purpuratus), called Sp185/333, is composed of clustered genes with similar sequence as well as several types of repeats ranging in size from short tandem repeats (STRs) to large segmental duplications. This repetitive structure may have been the basis for the incorrect assembly of this gene family in the sea urchin genome sequence. Consequently, we have resolved the structure of the family and profiled the members by sequencing selected BAC clones using Illumina and PacBio approaches. BAC insert assemblies identified 15 predicted genes that are organized into three clusters. Two of the gene clusters have almost identical flanking regions, suggesting that they may be non-matching allelic clusters residing at the same genomic locus. GA STRs surround all genes and appear in large stretches at locations of putatively deleted genes. GAT STRs are positioned at the edges of segmental duplications that include a subset of the genes. The unique locations of the STRs suggest their involvement in gene deletions and segmental duplications. Genomic profiling of the Sp185/333 gene diversity in 10 sea urchins shows that no gene repertoires are shared among individuals, indicating a very high gene diversification rate for this family. The repetitive genomic structure of the Sp185/333 family that includes STRs in strategic locations may serve as a platform for a controlled mechanism which regulates the processes of gene recombination, gene conversion, duplication and deletion. The outcome is genomic instability and allelic mismatches, which may further drive the swift diversification of the Sp185/333 gene family that may improve the immune fitness of the species.


July 19, 2019

Ribbon: Visualizing complex genome alignments and structural variation

Visualization has played an extremely important role in the current genomic revolution to inspect and understand variants, expression patterns, evolutionary changes, and a number of other relationships. However, most of the information in read-to-reference or genome-genome alignments is lost for structural variations in the one-dimensional views of most genome browsers showing only reference coordinates. Instead, structural variations captured by long reads or assembled contigs often need more context to understand, including alignments and other genomic information from multiple chromosomes. We have addressed this problem by creating Ribbon (genomeribbon.com), an interactive online visualization tool that displays alignments along both reference and query sequences, along with any associated variant calls in the sample. This way Ribbon shows patterns in alignments of many reads across multiple chromosomes, while allowing detailed inspection of individual reads (Supplementary Note 1). For example, here we show a gene fusion in the SK-BR-3 breast cancer cell line linking the genes CYTH1 and EIF3H. While it has been found in the transcriptome previously, genome sequencing did not identify a direct chromosomal fusion between these two genes. After SMRT sequencing, Ribbon shows that there are indeed long reads that span from one gene to the other, going through not one but two variants, for the first time showing the genomic link between these two genes (Figure 1a). More gene fusions of this cancer cell line are investigated in Supplementary Note 2. Figure 1b shows another complex event in this sample made simple in Ribbon: the translocation of a 4.4 kb sequence deleted from chr19 and inserted into chr16. Thus, Ribbon enables understanding of complex variants, and it may also help in the detection of sequencing and sample preparation issues, testing of aligners and variant-callers, and rapid curation of structural variant candidates (Supplementary Note 3).
In addition to SAM and BAM files with long, short, or paired-end reads, Ribbon can also load coordinate files from whole genome aligners such as MUMmer. Therefore, Ribbon can be used to test assembly algorithms or inspect the similarity between species. Supplementary Note 4 shows a comparison of gorilla and human genomes using Ribbon, highlighting major structural differences. In conclusion, Ribbon is a powerful interactive web tool for viewing complex genomic alignments.


July 19, 2019

The deep origin and recent loss of venom toxin genes in rattlesnakes.

The genetic origin of novel traits is a central but challenging puzzle in evolutionary biology. Among snakes, phospholipase A2 (PLA2)-related toxins have evolved in different lineages to function as potent neurotoxins, myotoxins, or hemotoxins. Here, we traced the genomic origin and evolution of PLA2 toxins by examining PLA2 gene number, organization, and expression in both neurotoxic and non-neurotoxic rattlesnakes. We found that even though most North American rattlesnakes do not produce neurotoxins, the genes of a specialized heterodimeric neurotoxin predate the origin of rattlesnakes and were present in their last common ancestor (~22 mya). The neurotoxin genes were then deleted independently in the lineages leading to the Western Diamondback (Crotalus atrox) and Eastern Diamondback (C. adamanteus) rattlesnakes (~6 mya), while a PLA2 myotoxin gene retained in C. atrox was deleted from the neurotoxic Mojave rattlesnake (C. scutulatus; ~4 mya). The rapid evolution of PLA2 gene number appears to be due to transposon invasion that provided a template for non-allelic homologous recombination. Copyright © 2016 Elsevier Ltd. All rights reserved.


July 19, 2019

Rapid functional and sequence differentiation of a tandemly repeated species-specific multigene family in Drosophila.

Gene clusters of recently duplicated genes are hotbeds for evolutionary change. However, our understanding of how mutational mechanisms and evolutionary forces shape the structural and functional evolution of these clusters is hindered by the high sequence identity among the copies, which typically results in their inaccurate representation in genome assemblies. The presumed testis-specific, chimeric gene Sdic originated and tandemly expanded in Drosophila melanogaster, contributing to increased male-male competition. Using various types of massively parallel sequencing data, we studied the organization, sequence evolution, and functional attributes of the different Sdic copies. By leveraging long-read sequencing data, we uncovered both copy number and order differences from the currently accepted annotation for the Sdic region. Despite evidence for pervasive gene conversion affecting the Sdic copies, we also detected signatures of two episodes of diversifying selection, which have contributed to the evolution of a variety of C-termini and miRNA binding site compositions. Expression analyses involving RNA-seq datasets from 59 different biological conditions revealed distinctive expression breadths among the copies, with three copies being transcribed in females, opening the possibility of a sexually antagonistic effect. Phenotypic assays using Sdic knock-out strains indicated that should this antagonistic effect exist, it does not compromise female fertility. Our results strongly suggest that the genome consolidation of the Sdic gene cluster is more the result of a quick exploration of different paths of molecular tinkering by different copies than a mere dosage increase, which could be a recurrent evolutionary outcome in the presence of persistent sexual selection. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.


July 19, 2019

Targeted capture and sequencing of gene-sized DNA molecules.

Targeted capture provides an efficient and sensitive means for sequencing specific genomic regions in a high-throughput manner. To date, this method has mostly been used to capture exons from the genome (the exome) using short insert libraries and short-read sequencing technology, enabling the identification of genetic variants or new members of large gene families. Sequencing larger molecules results in the capture of whole genes, including intronic and intergenic sequences that are typically more polymorphic and allow the resolution of the gene structure of homologous genes, which are often clustered together on the chromosome. Here, we describe an improved method for the capture and single-molecule sequencing of DNA molecules as large as 7 kb by means of size selection and optimized PCR conditions. Our approach can be used to capture, sequence, and distinguish between similar members of the NB-LRR gene family, key genes in plant immune systems.


July 19, 2019

Methylome analysis of two Xanthomonas spp. using Single-Molecule Real-Time Sequencing.

Single-molecule real-time (SMRT) sequencing allows identification of methylated DNA bases and methylation patterns/motifs at the genome level. Using SMRT sequencing, diverse bacterial methylomes including those of Helicobacter pylori, Lactobacillus spp., and Escherichia coli have been determined, and previously unreported DNA methylation motifs have been identified. However, the methylomes of Xanthomonas species, which belong to the most important plant pathogenic bacterial genus, have not been documented. Here, we report the methylomes of Xanthomonas axonopodis pv. glycines (Xag) strain 8ra and X. campestris pv. vesicatoria (Xcv) strain 85-10. We identified N(6)-methyladenine (6mA) and N(4)-methylcytosine (4mC) modification in both genomes. In addition, we assigned putative DNA methylation motifs including previously unreported methylation motifs via REBASE and MotifMaker, and compared methylation patterns in both species. Although Xag and Xcv belong to the same genus, their methylation patterns were dramatically different. The number of 4mC DNA bases in Xag (66,682) was significantly higher (29-fold) than in Xcv (2,321). In contrast, the number of 6mA DNA bases (4,147) in Xag was comparable to the number in Xcv (5,491). Strikingly, there were no common or shared motifs in the 10 most frequently methylated motifs of both strains, indicating they possess unique species- or strain-specific methylation motifs. Among the 20 most frequent motifs from both strains, for 9 motifs at least 1% of the methylated bases were located in putative promoter regions. Methylome analysis by SMRT sequencing technology is the first step toward understanding the biology and functions of DNA methylation in this genus.
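
The fold difference quoted above can be checked directly from the counts given in the abstract. The following is a minimal sketch of that arithmetic; the helper name `fold_difference` and the dictionary layout are illustrative assumptions, while the four counts are the ones stated in the text.

```python
def fold_difference(larger: int, smaller: int) -> float:
    """Return how many times `larger` exceeds `smaller`."""
    if smaller <= 0:
        raise ValueError("the denominator must be a positive count")
    return larger / smaller


# Methylated base counts as reported in the abstract.
methylated_bases = {
    "4mC": {"Xag": 66682, "Xcv": 2321},
    "6mA": {"Xag": 4147, "Xcv": 5491},
}

# 4mC: Xag relative to Xcv (the "29-fold" comparison).
fold_4mc = fold_difference(
    methylated_bases["4mC"]["Xag"], methylated_bases["4mC"]["Xcv"]
)

# 6mA: Xcv relative to Xag (described in the abstract as comparable).
fold_6ma = fold_difference(
    methylated_bases["6mA"]["Xcv"], methylated_bases["6mA"]["Xag"]
)

print(f"4mC fold difference (Xag/Xcv): {fold_4mc:.1f}")
print(f"6mA fold difference (Xcv/Xag): {fold_6ma:.2f}")
```

The 4mC ratio rounds to the 29-fold figure given in the abstract, while the 6mA counts differ by well under twofold.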

