September 22, 2019  |  

A survey of localized sequence rearrangements in human DNA.

Genomes mutate and evolve in ways simple (substitution or deletion of bases) and complex (e.g. chromosome shattering). We do not fully understand what types of complex mutation occur, and we cannot routinely characterize arbitrarily-complex mutations in a high-throughput, genome-wide manner. Long-read DNA sequencing methods (e.g. PacBio, nanopore) are promising for this task, because one read may encompass a whole complex mutation. We describe an analysis pipeline to characterize arbitrarily-complex ‘local’ mutations, i.e. intrachromosomal mutations encompassed by one DNA read. We apply it to nanopore and PacBio reads from one human cell line (NA12878), and survey sequence rearrangements, both real and artifactual. Almost all the real rearrangements belong to recurring patterns or motifs: the most common is tandem multiplication (e.g. heptuplication), but there are also complex patterns such as localized shattering, which resembles DNA damage by radiation. Gene conversions are identified, including one between hemoglobin gamma genes. This study demonstrates a way to find intricate rearrangements with any number of duplications, deletions, and repositionings. It demonstrates a probability-based method to resolve ambiguous rearrangements involving highly similar sequences, as occurs in gene conversion. We present a catalog of local rearrangements in one human cell line, and show which rearrangement patterns occur.


September 22, 2019  |  

De novo assembly and phasing of dikaryotic genomes from two isolates of Puccinia coronata f. sp. avenae, the causal agent of oat crown rust.

Oat crown rust, caused by the fungus Pucinnia coronata f. sp. avenae, is a devastating disease that impacts worldwide oat production. For much of its life cycle, P. coronata f. sp. avenae is dikaryotic, with two separate haploid nuclei that may vary in virulence genotype, highlighting the importance of understanding haplotype diversity in this species. We generated highly contiguous de novo genome assemblies of two P. coronata f. sp. avenae isolates, 12SD80 and 12NC29, from long-read sequences. In total, we assembled 603 primary contigs for 12SD80, for a total assembly length of 99.16 Mbp, and 777 primary contigs for 12NC29, for a total length of 105.25 Mbp; approximately 52% of each genome was assembled into alternate haplotypes. This revealed structural variation between haplotypes in each isolate equivalent to more than 2% of the genome size, in addition to about 260,000 and 380,000 heterozygous single-nucleotide polymorphisms in 12SD80 and 12NC29, respectively. Transcript-based annotation identified 26,796 and 28,801 coding sequences for isolates 12SD80 and 12NC29, respectively, including about 7,000 allele pairs in haplotype-phased regions. Furthermore, expression profiling revealed clusters of coexpressed secreted effector candidates, and the majority of orthologous effectors between isolates showed conservation of expression patterns. However, a small subset of orthologs showed divergence in expression, which may contribute to differences in virulence between 12SD80 and 12NC29. This study provides the first haplotype-phased reference genome for a dikaryotic rust fungus as a foundation for future studies into virulence mechanisms in P. coronata f. sp. avenaeIMPORTANCE Disease management strategies for oat crown rust are challenged by the rapid evolution of Puccinia coronata f. sp. avenae, which renders resistance genes in oat varieties ineffective. Despite the economic importance of understanding P. coronata f. sp. avenae, resources to study the molecular mechanisms underpinning pathogenicity and the emergence of new virulence traits are lacking. Such limitations are partly due to the obligate biotrophic lifestyle of P. coronata f. sp. avenae as well as the dikaryotic nature of the genome, features that are also shared with other important rust pathogens. This study reports the first release of a haplotype-phased genome assembly for a dikaryotic fungal species and demonstrates the amenability of using emerging technologies to investigate genetic diversity in populations of P. coronata f. sp. avenae. Copyright © 2018 Miller et al.


July 19, 2019  |  

Detecting AGG interruptions in male and female FMR1 premutation carriers by single-molecule sequencing.

The FMR1 gene contains an unstable CGG repeat in its 5′ untranslated region. Premutation alleles range between 55 and 200 repeat units and confer a risk for developing fragile X-associated tremor/ataxia syndrome or fragile X-associated primary ovarian insufficiency. Furthermore, the premutation allele often expands to a full mutation during female germline transmission giving rise to the fragile X syndrome. The risk for a premutation to expand depends mainly on the number of CGG units and the presence of AGG interruptions in the CGG repeat. Unfortunately, the detection of AGG interruptions is hampered by technical difficulties. Here, we demonstrate that single-molecule sequencing enables the determination of not only the repeat size, but also the complete repeat sequence including AGG interruptions in male and female alleles with repeats ranging from 45 to 100 CGG units. We envision this method will facilitate research and diagnostic analysis of the FMR1 repeat expansion. © 2016 WILEY PERIODICALS, INC.


July 19, 2019  |  

Mapping the landscape of tandem repeat variability by targeted long read single molecule sequencing in familial X-linked intellectual disability.

The etiology of more than half of all patients with X-linked intellectual disability remains elusive, despite array-based comparative genomic hybridization, whole exome or genome sequencing. Since short read massive parallel sequencing approaches do not allow the detection of larger tandem repeat expansions, we hypothesized that such expansions could be a hidden cause of X-linked intellectual disability.We selectively captured over 1800 tandem repeats on the X chromosome and characterized them by long read single molecule sequencing in 3 families with idiopathic X-linked intellectual disability. In male DNA samples, full tandem repeat length sequences were obtained for 88-93% of the targets and up to 99.6% of the repeats with a moderate guanine-cytosine content. Read length and analysis pipeline allow to detect cases of >?900?bp tandem repeat expansion. In one family, one repeat expansion co-occurs with down-regulation of the neighboring MIR222 gene. This gene has previously been implicated in intellectual disability and is apparently linked to FMR1 and NEFH overexpression associated with neurological disorders.This study demonstrates the power of single molecule sequencing to measure tandem repeat lengths and detect expansions, and suggests that tandem repeat mutations may be a hidden cause of X-linked intellectual disability.


July 7, 2019  |  

Exploring possible DNA structures in real-time polymerase kinetics using Pacific Biosciences sequencer data.

BackgroundPausing of DNA polymerase can indicate the presence of a DNA structure that differs from the canonical double-helix. Here we detail a method to investigate how polymerase pausing in the Pacific Biosciences sequencer reads can be related to DNA sequences. The Pacific Biosciences sequencer uses optics to view a polymerase and its interaction with a single DNA molecule in real-time, offering a unique way to detect potential alternative DNA structures.ResultsWe have developed a new way to examine polymerase kinetics data and relate it to the DNA sequence by using a wavelet transform of read information from the sequencer. We use this method to examine how polymerase kinetics are related to nucleotide base composition. We then examine tandem repeat sequences known for their ability to form different DNA structures: (CGG)n and (CG)n repeats which can, respectively, form G-quadruplex DNA and Z-DNA. We find pausing around the (CGG)n repeat that may indicate the presence of G-quadruplexes in some of the sequencer reads. The (CG)n repeat does not appear to cause polymerase pausing, but its kinetics signature nevertheless suggests the possibility that alternative nucleotide conformations may sometimes be present.ConclusionWe discuss the implications of using our method to discover DNA sequences capable of forming alternative structures. The analyses presented here can be reproduced on any Pacific Biosciences kinetics data for any DNA pattern of interest using an R package that we have made publicly available.


July 7, 2019  |  

STRetch: detecting and discovering pathogenic short tandem repeat expansions.

Short tandem repeat (STR) expansions have been identified as the causal DNA mutation in dozens of Mendelian diseases. Most existing tools for detecting STR variation with short reads do so within the read length and so are unable to detect the majority of pathogenic expansions. Here we present STRetch, a new genome-wide method to scan for STR expansions at all loci across the human genome. We demonstrate the use of STRetch for detecting STR expansions using short-read whole-genome sequencing data at known pathogenic loci as well as novel STR loci. STRetch is open source software, available from github.com/Oshlack/STRetch .


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.