July 7, 2019

Structural variation offers new home for disease associations and gene discovery

Following completion of the Human Genome Project, most studies of human genetic variation have centered on single nucleotide polymorphisms (SNPs). SNPs are numerous in individual genomes and serve as useful genetic markers in association studies across a population. These markers have been leveraged to identify genetic loci for disease risk and draw associations with numerous traits of interest. Despite their usefulness, SNPs do not tell the whole story. For example, most SNPs are associated with only a small increased risk of disease, and they usually cannot identify on their own which genes are causal. This has resulted in what many researchers have referred to as missing or hidden heritability.


July 7, 2019

Lightning-fast genome variant detection with GROM.

Current human whole genome sequencing projects produce massive amounts of data, often creating significant computational challenges. Different approaches have been developed for each type of genome variant and method of its detection, requiring users to run multiple algorithms to find variants. We present GROM (Genome Rearrangement OmniMapper), a novel comprehensive variant detection algorithm accepting aligned read files as input and finding SNVs, indels, structural variants (SVs), and copy number variants (CNVs). We show that GROM outperforms state-of-the-art methods on seven validated benchmarks using two whole genome sequencing (WGS) datasets. Additionally, GROM boasts lightning-fast run times, analyzing a 50x WGS human dataset (NA12878) on commonly available computer hardware in 11 minutes, more than an order of magnitude (up to 72 times) faster than tools detecting a similar range of variants. Addressing the needs of big data analysis, GROM combines SNV, indel, SV, and CNV detection in one algorithm, providing superior speed, sensitivity, and precision. GROM is also able to detect CNVs, SNVs and indels in non-paired read WGS libraries, as well as SNVs and indels in whole exome or RNA sequencing datasets.


July 7, 2019

The Tartary buckwheat genome provides insights into rutin biosynthesis and abiotic stress tolerance.

Tartary buckwheat (Fagopyrum tataricum) is an important pseudocereal crop that is strongly adapted to growth in adverse environments. Its gluten-free grain contains complete proteins with a well-balanced composition of essential amino acids and is a rich source of beneficial phytochemicals that provide significant health benefits. Here, we report a high-quality, chromosome-scale Tartary buckwheat genome sequence of 489.3 Mb that is assembled by combining whole-genome shotgun sequencing of both Illumina short reads and single-molecule real-time long reads, sequence tags of a large DNA insert fosmid library, Hi-C sequencing data, and BioNano genome maps. We annotated 33 366 high-confidence protein-coding genes based on expression evidence. Comparisons of the intra-genome with the sugar beet genome revealed an independent whole-genome duplication that occurred in the buckwheat lineage after they diverged from the common ancestor, which was not shared with rosids or asterids. The reference genome facilitated the identification of many new genes predicted to be involved in rutin biosynthesis and regulation, aluminum stress resistance, and in drought and cold stress responses. Our data suggest that Tartary buckwheat’s ability to tolerate high levels of abiotic stress is attributed to the expansion of several gene families involved in signal transduction, gene regulation, and membrane transport. The availability of these genomic resources will facilitate the discovery of agronomically and nutritionally important genes and genetic improvement of Tartary buckwheat. Copyright © 2017 The Author. Published by Elsevier Inc. All rights reserved.


July 7, 2019

Genome architecture and evolution of a unichromosomal asexual nematode.

Asexual reproduction in animals, though rare, is the main or exclusive mode of reproduction in some long-lived lineages. The longevity of asexual clades may be correlated with the maintenance of heterozygosity by mechanisms that rearrange genomes and reduce recombination. Asexual species thus provide an opportunity to gain insight into the relationship between molecular changes, genome architecture, and cellular processes. Here we report the genome sequence of the parthenogenetic nematode Diploscapter pachys with only one chromosome pair. We show that this unichromosomal architecture is shared by a long-lived clade of asexual nematodes closely related to the genetic model organism Caenorhabditis elegans. Analysis of the genome assembly reveals that the unitary chromosome arose through fusion of six ancestral chromosomes, with extensive rearrangement among neighboring regions. Typical nematode telomeres and telomeric protection-encoding genes are lacking. Most regions show significant heterozygosity; homozygosity is largely concentrated to one region and attributed to gene conversion. Cell-biological and molecular evidence is consistent with the absence of key features of meiosis I, including synapsis and recombination. We propose that D. pachys preserves heterozygosity and produces diploid embryos without fertilization through a truncated meiosis. As a prelude to functional studies, we demonstrate that D. pachys is amenable to experimental manipulation by RNA interference. Copyright © 2017 Elsevier Ltd. All rights reserved.


July 7, 2019

Lightweight BWT and LCP merging via the gap algorithm

Recently, Holt and McMillan [Bioinformatics 2014, ACM-BCB 2014] have proposed a simple and elegant algorithm to merge the Burrows-Wheeler transforms of a collection of strings. In this paper we show that their algorithm can be improved so that, in addition to the BWTs, it also merges the Longest Common Prefix (LCP) arrays. Because of its small memory footprint this new algorithm can be used for the final merge of BWT and LCP arrays computed by a faster but memory intensive construction algorithm.
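As a reference point for the two structures being merged, the following naive Python sketch builds the BWT and LCP array of a single string by sorting all of its suffixes. It is not the Holt and McMillan merge or the improved algorithm described above; the function name `bwt_and_lcp` and the `$` terminator convention are assumptions made for illustration.

```python
def bwt_and_lcp(text):
    """Naive BWT and LCP array of `text`, which must end with a unique '$'."""
    n = len(text)
    # Suffix array by direct sorting: quadratic or worse, fine for tiny inputs only.
    sa = sorted(range(n), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps around for suffix 0).
    bwt = "".join(text[i - 1] for i in sa)
    # lcp[k]: length of the common prefix of the k-th and (k-1)-th sorted suffixes.
    lcp = [0] * n
    for k in range(1, n):
        a, b = text[sa[k - 1]:], text[sa[k]:]
        h = 0
        while h < min(len(a), len(b)) and a[h] == b[h]:
            h += 1
        lcp[k] = h
    return bwt, lcp


bwt, lcp = bwt_and_lcp("banana$")
print(bwt)  # annb$aa
print(lcp)  # [0, 0, 1, 3, 0, 0, 2]
```

A merge algorithm produces these same arrays for the union of two collections without re-sorting the suffixes, which is where the memory savings come from.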


July 7, 2019

Sunflower leaf senescence: A complex genetic process with economic impact on crop production

Leaf senescence is a complex process controlled by multiple genetic and environmental variables. In different crops, a delay in leaf senescence has an important impact on grain yield through the maintenance of the photosynthetic leaf area during the reproductive stage. In sunflower (Helianthus annuus L.), the fourth largest oil crop worldwide, senescence reduces the capacity of plants to maintain their green leaf area for longer periods, especially during the grain filling phase, leading to important economic losses. In crop species, given the temporal gap between the onset and the phenotypic detection of senescence, the identification of both candidate genes and functional stay-green is indispensable to enable the early detection of senescence, the elucidation of molecular mechanisms and the development of tools for breeding applications. This chapter presents a comprehensive literature review of the leaf senescence process, not only in model plant species but also in agronomically relevant crops. Results derived from systems biology approaches integrating transcriptomic, metabolomic and physiological data, as well as those leading to the selection and characterization of stay-green sunflower genotypes, are included, making an important contribution to the knowledge of the leaf senescence process and providing a valuable tool to assist in crop breeding.


July 7, 2019

Molecular cloning and functional expression of the K(+) channel KV7.1 and the regulatory subunit KCNE1 from equine myocardium.

The voltage-gated K(+)-channel KV7.1 and the subunit KCNE1, encoded by the KCNQ1 and KCNE1 genes, respectively, are responsible for termination of the cardiac action potential. In humans, mutations in these genes can predispose patients to arrhythmias and sudden cardiac death (SCD). To characterize equine KV7.1/KCNE1 currents and compare them to human KV7.1/KCNE1 currents to determine whether KV7.1/KCNE1 plays a similar role in equine and human hearts. mRNA encoding KV7.1 and KCNE1 was isolated from equine hearts, sequenced, and cloned into expression vectors. The channel subunits were heterologously expressed in Xenopus laevis oocytes or CHO-K1 cells and characterized using voltage-clamp techniques. Equine KV7.1/KCNE1 expressed in CHO-K1 cells exhibited electrophysiological properties that are overall similar to the human orthologs; however, a slower deactivation was found which could result in more open channels at fast rates. The results suggest that the equine KV7.1/KCNE1 channel may be important for cardiac repolarization and this could indicate that horses are susceptible to SCD caused by mutations in KCNQ1 and KCNE1. Copyright © 2017 Elsevier Ltd. All rights reserved.


July 7, 2019

New insights into structural organization and gene duplication in a 1.75-Mb genomic region harboring the α-gliadin gene family in Aegilops tauschii, the source of wheat D genome.

Among the wheat prolamins important for its end-use traits, α-gliadins are the most abundant, and are also a major cause of food-related allergies and intolerances. Previous studies of various wheat species estimated that between 25 and 150 α-gliadin genes reside in the Gli-2 locus regions. To better understand the evolution of this complex gene family, the DNA sequence of a 1.75-Mb genomic region spanning the Gli-2 locus was analyzed in the diploid grass, Aegilops tauschii, the ancestral source of D genome in hexaploid bread wheat. Comparison with orthologous regions from rice, sorghum, and Brachypodium revealed rapid and dynamic changes only occurring to the Ae. tauschii Gli-2 region, including insertions of high numbers of non-syntenic genes and a high rate of tandem gene duplications, the latter of which have given rise to 12 copies of α-gliadin genes clustered within a 550-kb region. Among them, five copies have undergone pseudogenization by various mutation events. Insights into the evolutionary relationship of the duplicated α-gliadin genes were obtained from their genomic organization, transcription patterns, transposable element insertions and phylogenetic analyses. An ancestral glutamate-like receptor (GLR) gene encoding putative amino acid sensor in all four grass species has duplicated only in Ae. tauschii and generated three more copies that are interspersed with the α-gliadin genes. Phylogenetic inference and different gene expression patterns support functional divergence of the Ae. tauschii GLR copies after duplication. Our results suggest that the duplicates of α-gliadin and GLR genes have likely taken different evolutionary paths: conservation for the former and neofunctionalization for the latter. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.


July 7, 2019

The unusual S locus of Leavenworthia is composed of two sets of paralogous loci.

The Leavenworthia self-incompatibility locus (S locus) consists of paralogs (Lal2, SCRL) of the canonical Brassicaceae S locus genes (SRK, SCR), and is situated in a genomic position that differs from the ancestral one in the Brassicaceae. Unexpectedly, in a small number of Leavenworthia alabamica plants examined, sequences closely resembling exon 1 of SRK have been found, but the function of these has remained unclear. BAC cloning and expression analyses were employed to characterize these SRK-like sequences. An SRK-positive Bacterial Artificial Chromosome clone was found to contain complete SRK and SCR sequences located close by one another in the derived genomic position of the Leavenworthia S locus, and in place of the more typical Lal2 and SCRL sequences. These sequences are expressed in stigmas and anthers, respectively, and crossing data show that the SRK/SCR haplotype is functional in self-incompatibility. Population surveys indicate that < 5% of Leavenworthia S loci possess such alleles. An ancestral translocation or recombination event involving SRK/SCR and Lal2/SCRL likely occurred, together with neofunctionalization of Lal2/SCRL, and both haplotype groups now function as Leavenworthia S locus alleles. These findings suggest that S locus alleles can have distinctly different evolutionary origins. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.


July 7, 2019

Draft nuclear genome sequence of the halophilic and beta-carotene-accumulating green alga Dunaliella salina strain CCAP19/18.

The halotolerant alga Dunaliella salina is a model for stress tolerance and is used commercially for production of beta-carotene (=pro-vitamin A). The presented draft genome of the genuine strain CCAP19/18 will allow investigations into metabolic processes involved in regulation of stress responses, including carotenogenesis and adaptations to life in high-salinity environments. Copyright © 2017 Polle et al.


July 7, 2019

GRIDSS: sensitive and specific genomic rearrangement detection using positional de Bruijn graph assembly.

The identification of genomic rearrangements with high sensitivity and specificity using massively parallel sequencing remains a major challenge, particularly in precision medicine and cancer research. Here, we describe a new method for detecting rearrangements, GRIDSS (Genome Rearrangement IDentification Software Suite). GRIDSS is a multithreaded structural variant (SV) caller that performs efficient genome-wide break-end assembly prior to variant calling using a novel positional de Bruijn graph-based assembler. By combining assembly, split read, and read pair evidence using a probabilistic scoring, GRIDSS achieves high sensitivity and specificity on simulated, cell line, and patient tumor data, recently winning SV subchallenge #5 of the ICGC-TCGA DREAM8.5 Somatic Mutation Calling Challenge. On human cell line data, GRIDSS halves the false discovery rate compared to other recent methods while matching or exceeding their sensitivity. GRIDSS identifies nontemplate sequence insertions, microhomologies, and large imperfect homologies, estimates a quality score for each breakpoint, stratifies calls into high or low confidence, and supports multisample analysis. © 2017 Cameron et al.; Published by Cold Spring Harbor Laboratory Press.


July 7, 2019

The sea cucumber genome provides insights into morphological evolution and visceral regeneration.

Apart from sharing common ancestry with chordates, sea cucumbers exhibit a unique morphology and exceptional regenerative capacity. Here we present the complete genome sequence of an economically important sea cucumber, A. japonicus, generated using Illumina and PacBio platforms, to achieve an assembly of approximately 805 Mb (contig N50 of 190 Kb and scaffold N50 of 486 Kb), with 30,350 protein-coding genes and high continuity. We used this resource to explore key genetic mechanisms behind the unique biological characters of sea cucumbers. Phylogenetic and comparative genomic analyses revealed the presence of marker genes associated with notochord and gill slits, suggesting that these chordate features were present in ancestral echinoderms. The unique shape and weak mineralization of the sea cucumber adult body were also preliminarily explained by the contraction of biomineralization genes. Genome, transcriptome, and proteome analyses of organ regrowth after induced evisceration provided insight into the molecular underpinnings of visceral regeneration, including a specific tandem-duplicated prostatic secretory protein of 94 amino acids (PSP94)-like gene family and a significantly expanded fibrinogen-related protein (FREP) gene family. This high-quality genome resource will provide a useful framework for future research into biological processes and evolution in deuterostomes, including remarkable regenerative abilities that could have medical applications. Moreover, the multiomics data will be of prime value for commercial sea cucumber breeding programs.
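The contig and scaffold N50 values quoted above are standard assembly contiguity statistics: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A minimal Python sketch of that calculation follows; the function name `n50` and the example lengths are illustrative assumptions, not data from the sea cucumber assembly.

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no sequences given")


# Toy assembly of nine contigs totalling 54 units.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # 8
```

In the toy example the three longest contigs (10, 9 and 8) sum to 27, exactly half of 54, so the N50 is 8.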


July 7, 2019

Comparative analysis of mitochondrial genomes of geographic variants of the gypsy moth, Lymantria dispar, reveals a previously undescribed genotypic entity.

The gypsy moth, Lymantria dispar L., is one of the most destructive forest pests in the world. While the subspecies established in North America is the European gypsy moth (L. dispar dispar), whose females are flightless, the two Asian subspecies, L. dispar asiatica and L. dispar japonica, have flight-capable females, enhancing their invasiveness and warranting precautionary measures to prevent their permanent establishment in North America. Various molecular tools have been developed to help distinguish European from Asian subspecies, several of which are based on the mitochondrial barcode region. In an effort to identify additional informative markers, we undertook the sequencing and analysis of the mitogenomes of 10 geographic variants of L. dispar, including two or more variants of each subspecies, plus the closely related L. umbrosa as outgroup. Several regions of the gypsy moth mitogenomes displayed nucleotide substitutions with potential usefulness for the identification of subspecies and/or geographic origins. Interestingly, the mitogenome of one geographic variant displayed significant divergence relative to the remaining variants, raising questions about its taxonomic status. Phylogenetic analyses placed this population from northern Iran as basal to the L. dispar clades. The present findings will help improve diagnostic tests aimed at limiting risks of Asian gypsy moth (AGM) invasions.


July 7, 2019

The asparagus genome sheds light on the origin and evolution of a young Y chromosome.

Sex chromosomes evolved from autosomes many times across the eukaryote phylogeny. Several models have been proposed to explain this transition, some involving male and female sterility mutations linked in a region of suppressed recombination between X and Y (or Z/W, U/V) chromosomes. Comparative and experimental analysis of a reference genome assembly for a double haploid YY male garden asparagus (Asparagus officinalis L.) individual implicates separate but linked genes as responsible for sex determination. Dioecy has evolved recently within Asparagus and sex chromosomes are cytogenetically identical with the Y, harboring a megabase segment that is missing from the X. We show that deletion of this entire region results in a male-to-female conversion, whereas loss of a single suppressor of female development drives male-to-hermaphrodite conversion. A single copy anther-specific gene with a male sterile Arabidopsis knockout phenotype is also in the Y-specific region, supporting a two-gene model for sex chromosome evolution.


July 7, 2019

Dense and accurate whole-chromosome haplotyping of individual genomes.

The diploid nature of the human genome is neglected in many analyses done today, where a genome is perceived as a set of unphased variants with respect to a reference genome. This lack of haplotype-level analyses can be explained by a lack of methods that can produce dense and accurate chromosome-length haplotypes at reasonable costs. Here we introduce an integrative phasing strategy that combines global, but sparse haplotypes obtained from strand-specific single-cell sequencing (Strand-seq) with dense, yet local, haplotype information available through long-read or linked-read sequencing. We provide comprehensive guidance on the required sequencing depths and reliably assign more than 95% of alleles (NA12878) to their parental haplotypes using as few as 10 Strand-seq libraries in combination with 10-fold coverage PacBio data or, alternatively, 10X Genomics linked-read sequencing data. We conclude that the combination of Strand-seq with different technologies represents an attractive solution to chart the genetic variation of diploid genomes.
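The idea of combining sparse global haplotypes with dense local ones can be illustrated with a toy Python sketch. It is only an illustration under stated assumptions, not the authors' method: each local block is a hypothetical mapping from variant position to the allele (0 or 1) on one side of the block, the sparse scaffold is a mapping from position to the allele on global haplotype H1, and blocks are oriented by a simple majority vote. The name `orient_blocks` and all positions are invented for the example.

```python
def orient_blocks(blocks, sparse_h1):
    """Orient dense local phase blocks against a sparse global haplotype.

    blocks    : list of dicts {position: allele on one side of the block}
    sparse_h1 : dict {position: allele on global haplotype H1}
    Returns a dict {position: allele on H1} for all anchored blocks.
    """
    h1 = {}
    for block in blocks:
        shared = [pos for pos in block if pos in sparse_h1]
        agree = sum(block[pos] == sparse_h1[pos] for pos in shared)
        disagree = len(shared) - agree
        if agree == disagree:
            continue  # no anchor or a tie: leave this block unassigned
        flip = disagree > agree
        for pos, allele in block.items():
            # Flipping swaps the two sides of a biallelic block.
            h1[pos] = 1 - allele if flip else allele
    return h1


blocks = [{100: 0, 150: 1, 200: 0}, {500: 1, 560: 1, 610: 0}]
sparse = {150: 1, 560: 0, 610: 1}
print(orient_blocks(blocks, sparse))
```

Here the first block already agrees with the scaffold at position 150, while the second disagrees at both shared positions and is flipped, so every variant in both blocks ends up assigned to a chromosome-length haplotype.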

