September 22, 2019

Evolutionary conservation of Y Chromosome ampliconic gene families despite extensive structural variation.

Despite claims that the mammalian Y Chromosome is on a path to extinction, comparative sequence analysis of primate Y Chromosomes has shown the decay of the ancestral single-copy genes has all but ceased in this eutherian lineage. The suite of single-copy Y-linked genes is highly conserved among the majority of eutherian Y Chromosomes due to strong purifying selection to retain dosage-sensitive genes. In contrast, the ampliconic regions of the Y Chromosome, which contain testis-specific genes that encode the majority of the transcripts on eutherian Y Chromosomes, are rapidly evolving and are thought to undergo species-specific turnover. However, ampliconic genes are known from only a handful of species, limiting insights into their long-term evolutionary dynamics. We used a clone-based sequencing approach employing both long- and short-read sequencing technologies to assemble ~2.4 Mb of representative ampliconic sequence dispersed across the domestic cat Y Chromosome, and identified the major ampliconic gene families and repeat units. We analyzed fluorescence in situ hybridization, qPCR, and whole-genome sequence data from 20 cat species and revealed that ampliconic gene families are conserved across the cat family Felidae but show high transcript diversity, copy number variation, and structural rearrangement. Our analysis of ampliconic gene evolution unveils a complex pattern of long-term gene content stability despite extensive structural variation on a nonrecombining background. © 2018 Brashear et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019

Regulation of yeast-to-hyphae transition in Yarrowia lipolytica.

The yeast Yarrowia lipolytica undergoes a morphological transition from yeast-to-hyphal growth in response to environmental conditions. A forward genetic screen was used to identify mutants that reliably remain in the yeast phase, which were then assessed by whole-genome sequencing. All the smooth mutants identified, so named because of their colony morphology, exhibit independent loss of DNA at a repetitive locus made up of interspersed ribosomal DNA and short 10- to 40-mer telomere-like repeats. The loss of repetitive DNA is associated with downregulation of genes with stress response elements (5′-CCCCT-3′) and upregulation of genes with cell cycle box (5′-ACGCG-3′) motifs in their promoter region. The stress response element is bound by the transcription factor Msn2p in Saccharomyces cerevisiae. We confirmed that the Y. lipolytica msn2 (Ylmsn2) ortholog is required for hyphal growth and found that overexpression of Ylmsn2 enables hyphal growth in smooth strains. The cell cycle box is bound by the Mbp1p/Swi6p complex in S. cerevisiae to regulate G1-to-S phase progression. We found that overexpression of either the Ylmbp1 or Ylswi6 homologs decreased hyphal growth and that deletion of either Ylmbp1 or Ylswi6 promoted hyphal growth in smooth strains. A second forward genetic screen for reversion to hyphal growth was performed with the smooth-33 mutant to identify additional genetic factors regulating hyphal growth in Y. lipolytica. Thirteen of the mutants sequenced from this screen had coding mutations in five kinases, including the histidine kinases Ylchk1 and Ylnik1 and kinases of the high-osmolarity glycerol response (HOG) mitogen-activated protein (MAP) kinase cascade Ylssk2, Ylpbs2, and Ylhog1. Together, these results demonstrate that Y. lipolytica transitions to hyphal growth in response to stress through multiple signaling pathways.

IMPORTANCE: Many yeasts undergo a morphological transition from yeast-to-hyphal growth in response to environmental conditions. We used forward and reverse genetic techniques to identify genes regulating this transition in Yarrowia lipolytica. We confirmed that the transcription factor Ylmsn2 is required for the transition to hyphal growth and found that signaling by the histidine kinases Ylchk1 and Ylnik1 as well as the MAP kinases of the HOG pathway (Ylssk2, Ylpbs2, and Ylhog1) regulates the transition to hyphal growth. These results suggest that Y. lipolytica transitions to hyphal growth in response to stress through multiple kinase pathways. Intriguingly, we found that a repetitive portion of the genome containing telomere-like and rDNA repeats may be involved in the transition to hyphal growth, suggesting a link between this region and the general stress response. Copyright © 2018 Pomraning et al.


September 22, 2019

Detection and visualization of complex structural variants from long reads.

With applications in cancer, drug metabolism, and disease etiology, understanding structural variation in the human genome is critical in advancing the thrusts of individualized medicine. However, structural variants (SVs) remain challenging to detect with high sensitivity using short-read sequencing technologies. This problem is exacerbated when considering complex SVs composed of multiple overlapping or nested rearrangements. Longer reads, such as those from Pacific Biosciences platforms, often span multiple breakpoints of such events, and thus provide a way to unravel small-scale complexities in SVs with higher confidence. We present CORGi (COmplex Rearrangement detection with Graph-search), a method for the detection and visualization of complex local genomic rearrangements. This method leverages the ability of long reads to span multiple breakpoints to untangle SVs that appear very complicated with respect to a reference genome. We validated our approach against both simulated long reads and real data from two long-read sequencing technologies. We demonstrate the ability of our method to identify breakpoints inserted in synthetic data with high accuracy, and the ability to detect and plot SVs from NA12878 germline, achieving 88.4% concordance between the two sets of sequence data. The patterns of complexity we find in many NA12878 SVs match known mechanisms associated with DNA replication and structural variant formation, and highlight the ability of our method to automatically label complex SVs with an intuitive combination of adjacent or overlapping reference transformations. CORGi is a method for interrogating genomic regions suspected to contain local rearrangements using long reads. Using pairwise alignments and graph search, CORGi produces labels and visualizations for local SVs of arbitrary complexity.
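The point about read length can be made concrete with a small sketch. It is not CORGi's algorithm and ignores alignment details entirely; it only counts how many breakpoints fall strictly inside a read's reference span, under the simplifying assumption that a read maps to one contiguous interval. The function name and all coordinates are hypothetical.

```python
from bisect import bisect_left, bisect_right


def breakpoints_spanned(read_start, read_end, breakpoints):
    """Count breakpoints p with read_start < p < read_end.

    `breakpoints` must be a sorted list of reference positions.
    """
    first_inside = bisect_right(breakpoints, read_start)  # first p > read_start
    first_outside = bisect_left(breakpoints, read_end)    # first p >= read_end
    return max(0, first_outside - first_inside)


# Hypothetical breakpoints of one nested rearrangement.
bps = [10_000, 10_400, 11_200]

# A 150 bp short read sees at most one of them ...
short_count = breakpoints_spanned(10_350, 10_500, bps)
# ... while a 3 kb long read covers the whole event.
long_count = breakpoints_spanned(9_000, 12_000, bps)

print(short_count, long_count)
```

A read that contains every breakpoint of an event carries the order and orientation of the rearranged segments in a single molecule, which is the property the abstract attributes to long reads.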


September 22, 2019

Approaches for surveying cosmic radiation damage in large populations of Arabidopsis thaliana seeds-Antarctic balloons and particle beams.

The Cosmic Ray Exposure Sequencing Science (CRESS) payload system is a proof-of-concept experiment to assess the genomic impact of space radiation on seeds. CRESS was designed as a secondary payload for the December 2016 high-altitude, high-latitude, and long-duration balloon flight carrying the Boron And Carbon Cosmic Rays in the Upper Stratosphere (BACCUS) experimental hardware. Investigation of the biological effects of Galactic Cosmic Radiation (GCR), particularly those of ions with High-Z and Energy (HZE), is of interest due to the genomic damage this type of radiation inflicts. The biological effects of upper-stratospheric mixed radiation above Antarctica (ANT) were sampled using Arabidopsis thaliana seeds and were compared to those resulting from a controlled simulation of GCR at Brookhaven National Laboratory (BNL) and to laboratory control seed. The payload developed for Antarctica exposure was broadly designed to 1U CubeSat specifications (10 cm × 10 cm × 10 cm, ≤1.33 kg), maintained 1 atm internal pressure, and carried an internal cargo of four seed trays (about 580,000 seeds) and twelve CR-39 Solid-State Nuclear Track Detectors (SSNTDs). The irradiated seeds were recovered, sterilized, and grown on Petri plates for phenotypic screening. BNL and ANT M0 seeds showed significantly reduced germination rates and elevated somatic mutation rates when compared to non-irradiated controls, with the BNL mutation rate also being significantly higher than that of ANT. Genomic DNA from mutants of interest was evaluated with whole-genome sequencing using PacBio SMRT technology. Sequence data revealed the presence of an array of genome structural variants in the genomes of M0 and M1 mutant plants.


September 22, 2019

CompStor Novos: a low cost yet fast assembly-based variant calling for personal genomes

Application of assembly methods for personal genome analysis from next-generation sequencing data has been limited by the requirement for expensive supercomputer hardware or long computation times when using ordinary resources. We describe CompStor Novos, which achieves supercomputer-class performance in de novo assembly computation time on standard server hardware, based on a tiered-memory algorithm. Run on commercial off-the-shelf servers, Novos assembly is more precise and 10-20 times faster than that of existing assembly algorithms. Furthermore, we integrated Novos into a variant calling pipeline and demonstrate that both compute times and precision of calling point variants and indels compare well with standard alignment-based pipelines. Additionally, assembly eliminates bias in the estimation of allele frequency for indels and naturally enables discovery of breakpoints for structural variants with base-pair resolution. Thus, Novos bridges the gap between alignment-based and assembly-based genome analyses. Extension and adaptation of its underlying algorithm will help quickly and fully harvest information in sequencing reads for personal genome reconstruction.


September 21, 2019

Discovery and genotyping of structural variation from long-read haploid genome sequence data.

In an effort to more fully understand the full spectrum of human genetic variation, we generated deep single-molecule, real-time (SMRT) sequencing data from two haploid human genomes. By using an assembly-based approach (SMRT-SV), we systematically assessed each genome independently for structural variants (SVs) and indels, resolving the sequence structure of 461,553 genetic variants from 2 bp to 28 kbp in length. We find that >89% of these variants have been missed as part of analysis of the 1000 Genomes Project even after adjusting for more common variants (MAF > 1%). We estimate that this theoretical human diploid differs by as much as ~16 Mbp with respect to the human reference, with long-read sequencing data providing a fivefold increase in sensitivity for genetic variants ranging in size from 7 bp to 1 kbp compared with short-read sequence data. Although a large fraction of genetic variants were not detected by short-read approaches, once the alternate allele is sequence-resolved, we show that 61% of SVs can be genotyped in short-read sequence data sets with high accuracy. Uncoupling discovery from genotyping thus allows for the majority of this missed common variation to be genotyped in the human population. Interestingly, when we repeat SV detection on a pseudodiploid genome constructed in silico by merging the two haploids, we find that ~59% of the heterozygous SVs are no longer detected by SMRT-SV. These results indicate that haploid resolution of long-read sequencing data will significantly increase sensitivity of SV detection. © 2017 Huddleston et al.; Published by Cold Spring Harbor Laboratory Press.


September 21, 2019

Long-read genome sequencing identifies causal structural variation in a Mendelian disease.

Purpose: Current clinical genomics assays primarily utilize short-read sequencing (SRS), but SRS has limited ability to evaluate repetitive regions and structural variants. Long-read sequencing (LRS) has complementary strengths, and we aimed to determine whether LRS could offer a means to identify overlooked genetic variation in patients undiagnosed by SRS. Methods: We performed low-coverage genome LRS to identify structural variants in a patient who presented with multiple neoplasia and cardiac myxomata, in whom the results of targeted clinical testing and genome SRS were negative. Results: This LRS approach yielded 6,971 deletions and 6,821 insertions >50 bp. Filtering for variants that are absent in an unrelated control and overlap a disease gene coding exon identified three deletions and three insertions. One of these, a heterozygous 2,184 bp deletion, overlaps the first coding exon of PRKAR1A, which is implicated in autosomal dominant Carney complex. RNA sequencing demonstrated decreased PRKAR1A expression. The deletion was classified as pathogenic based on guidelines for interpretation of sequence variants. Conclusion: This first successful application of genome LRS to identify a pathogenic variant in a patient suggests that LRS has significant potential for the identification of disease-causing structural variation. Larger studies will ultimately be required to evaluate the potential clinical utility of LRS.
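The filtering step described in the Results can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: the abstract does not say how a variant is matched against the control, so the sketch treats any overlapping control call of the same type as "present in the control". The `SV` and `Exon` types, the gene names, and all coordinates are hypothetical.

```python
from typing import NamedTuple


class SV(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    kind: str    # "DEL" or "INS"
    length: int  # variant length in bp


class Exon(NamedTuple):
    chrom: str
    start: int
    end: int
    gene: str


def overlaps(a, b):
    """True if two half-open intervals on the same chromosome intersect."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def filter_candidates(patient, control, exons, min_len=50):
    """Keep SVs longer than min_len that are absent from the control
    (assumption: no overlapping control call of the same type) and that
    overlap at least one disease-gene coding exon."""
    kept = []
    for sv in patient:
        if sv.length <= min_len:
            continue
        if any(c.kind == sv.kind and overlaps(sv, c) for c in control):
            continue
        genes = sorted({e.gene for e in exons if overlaps(sv, e)})
        if genes:
            kept.append((sv, genes))
    return kept


# Toy data with made-up coordinates and gene names.
patient = [
    SV("chr1", 1000, 3184, "DEL", 2184),  # overlaps an exon, not in control
    SV("chr1", 5000, 5030, "DEL", 30),    # too short
    SV("chr2", 100, 400, "DEL", 300),     # also seen in the control
    SV("chr3", 700, 701, "INS", 120),     # no exon overlap
]
control = [SV("chr2", 150, 450, "DEL", 300)]
exons = [
    Exon("chr1", 900, 1100, "GENE_A"),
    Exon("chr2", 200, 300, "GENE_B"),
    Exon("chr1", 5000, 5030, "GENE_C"),
]

hits = filter_candidates(patient, control, exons)
for sv, genes in hits:
    print(sv.kind, sv.chrom, sv.start, sv.end, genes)
```

In practice the control comparison would likely need a tolerance on breakpoint position and size, since long-read callers rarely place the same event at identical coordinates in two samples.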


September 21, 2019

Repair of double-strand breaks induced by CRISPR-Cas9 leads to large deletions and complex rearrangements.

CRISPR-Cas9 is poised to become the gene editing tool of choice in clinical contexts. Thus far, exploration of Cas9-induced genetic alterations has been limited to the immediate vicinity of the target site and distal off-target sequences, leading to the conclusion that CRISPR-Cas9 was reasonably specific. Here we report significant on-target mutagenesis, such as large deletions and more complex genomic rearrangements at the targeted sites in mouse embryonic stem cells, mouse hematopoietic progenitors and a human differentiated cell line. Using long-read sequencing and long-range PCR genotyping, we show that DNA breaks introduced by single-guide RNA/Cas9 frequently resolved into deletions extending over many kilobases. Furthermore, lesions distal to the cut site and crossover events were identified. The observed genomic damage in mitotically active cells caused by CRISPR-Cas9 editing may have pathogenic consequences.

