Menu
July 7, 2019

Variant review with the Integrative Genomics Viewer.

Manual review of aligned reads for confirmation and interpretation of variant calls is an important step in many variant calling pipelines for next-generation sequencing (NGS) data. Visual inspection can greatly increase the confidence in calls, reduce the risk of false positives, and help characterize complex events. The Integrative Genomics Viewer (IGV) was one of the first tools to provide NGS data visualization, and it currently provides a rich set of tools for inspection, validation, and interpretation of NGS datasets, as well as other types of genomic data. Here, we present a short overview of IGV’s variant review features for both single-nucleotide variants and structural variants, with examples from both cancer and germline datasets. IGV is freely available at https://www.igv.org Cancer Res; 77(21); e31-34. ©2017 AACR.©2017 American Association for Cancer Research.


July 7, 2019

Tools for annotation and comparison of structural variation.

The impact of structural variants (SVs) on a variety of organisms and diseases like cancer has become increasingly evident. Methods for SV detection when studying genomic differences across cells, individuals or populations are being actively developed. Currently, just a few methods are available to compare different SVs callsets, and no specialized methods are available to annotate SVs that account for the unique characteristics of these variant types. Here, we introduce SURVIVOR_ant, a tool that compares types and breakpoints for candidate SVs from different callsets and enables fast comparison of SVs to genomic features such as genes and repetitive regions, as well as to previously established SV datasets such as from the 1000 Genomes Project. As proof of concept we compared 16 SV callsets generated by different SV calling methods on a single genome, the Genome in a Bottle sample HG002 (Ashkenazi son), and annotated the SVs with gene annotations, 1000 Genomes Project SV calls, and four different types of repetitive regions. Computation time to annotate 134,528 SVs with 33,954 of annotations was 22 seconds on a laptop.


July 7, 2019

Interrogating the “unsequenceable” genomic trinucleotide repeat disorders by long-read sequencing.

Microsatellite expansion, such as trinucleotide repeat expansion (TRE), is known to cause a number of genetic diseases. Sanger sequencing and next-generation short-read sequencing are unable to interrogate TRE reliably. We developed a novel algorithm called RepeatHMM to estimate repeat counts from long-read sequencing data. Evaluation on simulation data, real amplicon sequencing data on two repeat expansion disorders, and whole-genome sequencing data generated by PacBio and Oxford Nanopore technologies showed superior performance over competing approaches. We concluded that long-read sequencing coupled with RepeatHMM can estimate repeat counts on microsatellites and can interrogate the “unsequenceable” genomic trinucleotide repeat disorders.


July 7, 2019

Hunting structural variants: Population by population

Until recently, most population-scale genome sequencing studies have focused on identifying single nucleotide variants (SNVs) to explore genetic differences between individuals. Like so many SNV-based genome-wide association studies, however, these efforts have had difficulty identifying causative genetic mechanisms underlying most complex functions. More and more, the genomics community has realised that structural variation is likely responsible for many of the traits and phenotypes that scientists have not been able to attribute to SNVs. This class of variants, defined as genetic differences of 50 bp or larger, accounts for most of the DNA sequence differences between any two people. Structural variants (SVs) are also already known to cause many common and rare diseases including ALS, schizophrenia, leukemia, Carney complex, and Huntington’s disease. Despite the importance of SVs, these larger variants have been understudied and underreported compared to their single-nucleotide counterparts. One reason is that they remain difficult to detect. Their length often means they cannot be fully spanned using short sequencing reads. They also often occur in highly repetitive or GC-rich regions of the genome, making them challenging targets. As such, this class of human genetic variation has remained vastly under-explored in global populations and is now ripe for discovery.


July 7, 2019

Hybrid de novo genome assembly and centromere characterization of the gray mouse lemur (Microcebus murinus).

The de novo assembly of repeat-rich mammalian genomes using only high-throughput short read sequencing data typically results in highly fragmented genome assemblies that limit downstream applications. Here, we present an iterative approach to hybrid de novo genome assembly that incorporates datasets stemming from multiple genomic technologies and methods. We used this approach to improve the gray mouse lemur (Microcebus murinus) genome from early draft status to a near chromosome-scale assembly.We used a combination of advanced genomic technologies to iteratively resolve conflicts and super-scaffold the M. murinus genome.We improved the M. murinus genome assembly to a scaffold N50 of 93.32 Mb. Whole genome alignments between our primary super-scaffolds and 23 human chromosomes revealed patterns that are congruent with historical comparative cytogenetic data, thus demonstrating the accuracy of our de novo scaffolding approach and allowing assignment of scaffolds to M. murinus chromosomes. Moreover, we utilized our independent datasets to discover and characterize sequences associated with centromeres across the mouse lemur genome. Quality assessment of the final assembly found 96% of mouse lemur canonical transcripts nearly complete, comparable to other published high-quality reference genome assemblies.We describe a new assembly of the gray mouse lemur (Microcebus murinus) genome with chromosome-scale scaffolds produced using a hybrid bioinformatic and sequencing approach. The approach is cost effective and produces superior results based on metrics of contiguity and completeness. Our results show that emerging genomic technologies can be used in combination to characterize centromeres of non-model species and to produce accurate de novo chromosome-scale genome assemblies of complex mammalian genomes.


July 7, 2019

Detection of complex structural variation from paired-end sequencing data

Detecting structural variants (SVs) from sequencing data is a key problem in genome analysis, but the full diversity of SVs is not captured by most methods. We introduce the Automated Reconstruction of Complex Structural Variants (ARC-SV) method, which detects a broad class of structural variants from paired-end whole genome sequencing (WGS) data. Analysis of samples from NA12878 and HuRef suggests that complex SVs are often misclassified by traditional methods. We validated our results both experimentally and by comparison to whole genome assembly and PacBio data; ARC-SV compares favorably to existing algorithms in general and gives state-of-the-art results on complex SV detection. By expanding the range of detectable SVs compared to commonly-used algorithms, ARC-SV allows additional information to be extracted from existing WGS data.


July 7, 2019

Comparative analysis of the radish genome with Brassica genomes

Raphanus sativus L. includes an annual root vegetable crop, radish, and diverse wild species. R. sativus has a long history of domestication, but its phylogenetic position in the tribe Brassiceae is controversial. A comprehensive analysis of the R. sativus genome will provide fundamental information about the structure of its genome, evolutionary features of polyploidy, and significant insight for phylogenetic delimitation of this species. Diverse genomic resources, including a high-density genetic map, clone libraries, cytogenetic data, and transcriptome data, have been developed to sequence the genome. Recently, the R. sativus cv. ‘WK10039’ (2n = 18, 510.8 Mb) genome was sequenced and assembled into nine chromosome pseudomolecules spanning >98% of the gene space. Comparative mapping of the tPCK-like ancestral genome based on conserved ortholog set markers and proteome comparison revealed that the R. sativus genome has intermediate characteristics between the Brassica A/C and B genomes with triplicated segments, suggesting an internal origin from the genus Brassica. The evolutionary characteristics shared between R. sativus and diploid Brassica species provide genomic evidence for species delimitation of R. sativus and reconstruction of the mesohexaploid ancestral genome.


July 7, 2019

Genomics of parallel adaptation at two timescales in Drosophila.

Two interesting unanswered questions are the extent to which both the broad patterns and genetic details of adaptive divergence are repeatable across species, and the timescales over which parallel adaptation may be observed. Drosophila melanogaster is a key model system for population and evolutionary genomics. Findings from genetics and genomics suggest that recent adaptation to latitudinal environmental variation (on the timescale of hundreds or thousands of years) associated with Out-of-Africa colonization plays an important role in maintaining biological variation in the species. Additionally, studies of interspecific differences between D. melanogaster and its sister species D. simulans have revealed that a substantial proportion of proteins and amino acid residues exhibit adaptive divergence on a roughly few million years long timescale. Here we use population genomic approaches to attack the problem of parallelism between D. melanogaster and a highly diverged conger, D. hydei, on two timescales. D. hydei, a member of the repleta group of Drosophila, is similar to D. melanogaster, in that it too appears to be a recently cosmopolitan species and recent colonizer of high latitude environments. We observed parallelism both for genes exhibiting latitudinal allele frequency differentiation within species and for genes exhibiting recurrent adaptive protein divergence between species. Greater parallelism was observed for long-term adaptive protein evolution and this parallelism includes not only the specific genes/proteins that exhibit adaptive evolution, but extends even to the magnitudes of the selective effects on interspecific protein differences. Thus, despite the roughly 50 million years of time separating D. melanogaster and D. hydei, and despite their considerably divergent biology, they exhibit substantial parallelism, suggesting the existence of a fundamental predictability of adaptive evolution in the genus.


July 7, 2019

Genetic maps and whole genome sequences of radish

Radish, Raphanus sativus L., is a member of Brassicaceae, to which Arabidopsis thaliana, a model plant in plant biology, belongs, as do other Brassica species including important crops. However, genetic and genomic studies of radish have been behind those of Arabidopsis and Brassica. In this decade, much effort has been made to develop genetic resources for radish, e.g., DNA markers, genetic maps, and whole genome sequences. Studies using the obtained information have revealed the genome structure of radish in terms of ancestral karyotype and have also prompted the identification of genes for agronomically important traits in radish through a map-based cloning strategy and quantitative trait locus analysis. In this chapter, we review the evolving development of radish genetic map in the past 15 years and the current status of genome sequencing of radish. We also introduce the latest strategy for the construction of a high-density genetic map using next-generation sequencing technology and propose a prospective direction of genetics and genomics research in radish which would be helpful for researchers and breeders in their efforts to promote radish breeding programs efficiently.


July 7, 2019

Insights into land plant evolution garnered from the Marchantia polymorpha genome.

The evolution of land flora transformed the terrestrial environment. Land plants evolved from an ancestral charophycean alga from which they inherited developmental, biochemical, and cell biological attributes. Additional biochemical and physiological adaptations to land, and a life cycle with an alternation between multicellular haploid and diploid generations that facilitated efficient dispersal of desiccation tolerant spores, evolved in the ancestral land plant. We analyzed the genome of the liverwort Marchantia polymorpha, a member of a basal land plant lineage. Relative to charophycean algae, land plant genomes are characterized by genes encoding novel biochemical pathways, new phytohormone signaling pathways (notably auxin), expanded repertoires of signaling pathways, and increased diversity in some transcription factor families. Compared with other sequenced land plants, M. polymorpha exhibits low genetic redundancy in most regulatory pathways, with this portion of its genome resembling that predicted for the ancestral land plant. PAPERCLIP. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.


July 7, 2019

Contributions of Zea mays subspecies mexicana haplotypes to modern maize.

Maize was domesticated from lowland teosinte (Zea mays ssp. parviglumis), but the contribution of highland teosinte (Zea mays ssp. mexicana, hereafter mexicana) to modern maize is not clear. Here, two genomes for Mo17 (a modern maize inbred) and mexicana are assembled using a meta-assembly strategy after sequencing of 10 lines derived from a maize-teosinte cross. Comparative analyses reveal a high level of diversity between Mo17, B73, and mexicana, including three Mb-size structural rearrangements. The maize spontaneous mutation rate is estimated to be 2.17?×?10-8 ~3.87?×?10-8 per site per generation with a nonrandom distribution across the genome. A higher deleterious mutation rate is observed in the pericentromeric regions, and might be caused by differences in recombination frequency. Over 10% of the maize genome shows evidence of introgression from the mexicana genome, suggesting that mexicana contributed to maize adaptation and improvement. Our data offer a rich resource for constructing the pan-genome of Zea mays and genetic improvement of modern maize varieties.


July 7, 2019

Genome sequence of the small brown planthopper, Laodelphax striatellus.

Laodelphax striatellus Fallén (Hemiptera: Delphacidae) is one of the most destructive rice pests. L. striatellus is different from 2 other rice planthoppers with a released genome sequence, Sogatella furcifera and Nilaparvata lugens, in many biological characteristics, such as host range, dispersal capacity, and vectoring plant viruses. Deciphering the genome of L. striatellus will further the understanding of the genetic basis of the biological differences among the 3 rice planthoppers.A total of 190 Gb of Illumina data and 32.4 Gb of Pacbio data were generated and used to assemble a high-quality L. striatellus genome sequence, which is 541 Mb in length and has a contig N50 of 118 Kb and a scaffold N50 of 1.08 Mb. Annotated repetitive elements account for 25.7% of the genome. A total of 17?736 protein-coding genes were annotated, capturing 97.6% and 98% of the BUSCO eukaryote and arthropoda genes, respectively. Compared with N. lugens and S. furcifera, L. striatellus has the smallest genome and the lowest gene number. Gene family expansion and transcriptomic analyses provided hints to the genomic basis of the differences in important traits such as host range, migratory habit, and plant virus transmission between L. striatellus and the other 2 planthoppers.We report a high-quality genome assembly of L. striatellus, which is an important genomic resource not only for the study of the biology of L. striatellus and its interactions with plant hosts and plant viruses, but also for comparison with other planthoppers.© The Authors 2017. Published by Oxford University Press.


July 7, 2019

Hidden genetic variation shapes the structure of functional elements in Drosophila.

Mutations that add, subtract, rearrange, or otherwise refashion genome structure often affect phenotypes, although the fragmented nature of most contemporary assemblies obscures them. To discover such mutations, we assembled the first new reference-quality genome of Drosophila melanogaster since its initial sequencing. By comparing this new genome to the existing D. melanogaster assembly, we created a structural variant map of unprecedented resolution and identified extensive genetic variation that has remained hidden until now. Many of these variants constitute candidates underlying phenotypic variation, including tandem duplications and a transposable element insertion that amplifies the expression of detoxification-related genes associated with nicotine resistance. The abundance of important genetic variation that still evades discovery highlights how crucial high-quality reference genomes are to deciphering phenotypes.


July 7, 2019

Assembly of an early-matured japonica (Geng) rice genome, Suijing18, based on PacBio and Illumina sequencing.

The early-matured japonica (Geng) rice variety, Suijing18 (SJ18), carries multiple elite traits including durable blast resistance, good grain quality, and high yield. Using PacBio SMRT technology, we produced over 25?Gb of long-read sequencing raw data from SJ18 with a coverage of 62×. Using Illumina paired-end whole-genome shotgun sequencing technology, we generated 59?Gb of short-read sequencing data from SJ18 (23.6?Gb from a 200?bp library with a coverage of 59× and 35.4?Gb from an 800?bp library with a coverage of 88×). With these data, we assembled a single SJ18 genome and then generated a set of annotation data. These data sets can be used to test new programs for variation deep mining, and will provide new insights into the genome structure, function, and evolution of SJ18, and will provide essential support for biological research in general.


July 7, 2019

A new species of Xenoturbella from the western Pacific Ocean and the evolution of Xenoturbella.

Xenoturbella is a group of marine benthic animals lacking an anus and a centralized nervous system. Molecular phylogenetic analyses group the animal together with the Acoelomorpha, forming the Xenacoelomorpha. This group has been suggested to be either a sister group to the Nephrozoa or a deuterostome, and therefore it may provide important insights into origins of bilaterian traits such as an anus, the nephron, feeding larvae and centralized nervous systems. However, only five Xenoturbella species have been reported and the evolutionary history of xenoturbellids and Xenacoelomorpha remains obscure.Here we describe a new Xenoturbella species from the western Pacific Ocean, and report a new xenoturbellid structure – the frontal pore. Non-destructive microCT was used to investigate the internal morphology of this soft-bodied animal. This revealed the presence of a frontal pore that is continuous with the ventral glandular network and which exhibits similarities with the frontal organ in acoelomorphs.Our results suggest that large size, oval mouth, frontal pore and ventral glandular network may be ancestral features for Xenoturbella. Further studies will clarify the evolutionary relationship of the frontal pore and ventral glandular network of xenoturbellids and the acoelomorph frontal organ. One of the habitats of the newly identified species is easily accessible from a marine station and so this species promises to be valuable for research on bilaterian and deuterostome evolution.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.