Short tandem repeat (STR) expansions have been identified as the causal DNA mutation in dozens of Mendelian diseases. Most existing tools for detecting STR variation with short reads do so within the read length and so are unable to detect the majority of pathogenic expansions. Here we present STRetch, a new genome-wide method to scan for STR expansions at all loci across the human genome. We demonstrate the use of STRetch for detecting STR expansions using short-read whole-genome sequencing data at known pathogenic loci as well as novel STR loci. STRetch is open-source software, available from github.com/Oshlack/STRetch.
The human X and Y chromosomes are heteromorphic but share a region of homology at the tips of their short arms, pseudoautosomal region 1 (PAR1), that supports obligate crossover in male meiosis. Although the boundary between pseudoautosomal and sex-specific DNA has traditionally been regarded as conserved among primates, it was recently discovered that the boundary position varies among human males, due to a translocation of ~110 kb from the X to the Y chromosome that creates an extended PAR1 (ePAR). This event has occurred at least twice in human evolution. So far, only limited evidence has been presented to suggest…
Acquired genomic structural variants (SVs) are major hallmarks of cancer genomes, but they are challenging to reconstruct from short-read sequencing data. Here we exploited the long reads of the nanopore platform using our customized pipeline, Picky (https://github.com/TheJacksonLaboratory/Picky), to reveal SVs of diverse architecture in a breast cancer model. We identified the full spectrum of SVs with superior specificity and sensitivity relative to short-read analyses, and uncovered repetitive DNA as the major source of variation. Examination of genome-wide breakpoints at nucleotide resolution uncovered micro-insertions as the common structural features associated with SVs. Breakpoint density across the genome is associated…
The killer-cell immunoglobulin-like receptor (KIR) genes regulate natural killer cell activity, influencing predisposition to immune-mediated disease and affecting hematopoietic stem cell transplantation (HSCT) outcome. Owing to the complexity of the KIR locus, with extensive gene copy number variation (CNV) and allelic diversity, high-resolution characterization of KIR has so far been applied only to relatively small cohorts. Here, we present a comprehensive high-throughput KIR genotyping approach based on next-generation sequencing. Through PCR amplification of specific exons, our approach delivers both copy numbers of the individual genes and allelic information for every KIR gene. Ten-fold replicate analysis of a set…
This tutorial provides an overview of the Long Amplicon Analysis (LAA) application. The LAA algorithm generates highly accurate, phased and full-length consensus sequences from long amplicons. Applications of LAA include HLA typing, alternative haplotyping, and localized de novo assemblies of targeted genes.
This tutorial provides an overview of the Isoform Sequencing (Iso-Seq) analysis application. The Iso-Seq application provides reads that span entire transcript isoforms, from the 5′ end to the 3′ polyA-tail. Generation of accurate, full-length transcript sequences greatly simplifies analysis by eliminating the need for transcript reconstruction to infer isoforms using error-prone assembly of short RNA-seq reads.
A team of scientists has published one of the most detailed explorations to date of complex structural variation in a human genome. The results highlight just how much genomic variation is missed when working exclusively with short-read sequencing technologies.
Pacific Biosciences is making advances in the targeted sequencing space, including a partnership with Roche NimbleGen.
The approach should allow “haplotype phas[ing] and assembly of complex regions even in genomic regions containing complex repeats or PCR-challenged sequences that limit the performance of other synthetic long read approaches based on short read sequencing technologies”.
These solutions will combine the power of RainDance’s proprietary digital droplet technology and single-molecule barcoding capabilities with Pacific Biosciences’ proprietary long-range DNA amplification technology to provide sample preparation upstream of PacBio’s long-read sequencing system.
Researchers from Baylor College of Medicine reported in BMC Genomics how they employed multiple sequencing technologies, library preparations, assembly methods, and genome mapping tools to work toward creating a reference diploid genome. Long-read data from PacBio was especially important.
In case you’re not aware, the human genome is not completely sequenced. Regions of repeats and hard-to-decode sequence continue to elude scientists. And then there is the challenge of genetic variation. Nathan Blow looks at new technologies and methods that could help map these and other difficult-to-read stretches of DNA.
Genome editing researchers based at Stanford and Emory Universities have developed a method for tracking the outcome of editing experiments using SMRT Sequencing.