April 21, 2020  |  

Copy-number variants in clinical genome sequencing: deployment and interpretation for rare and undiagnosed disease.

Current diagnostic testing for genetic disorders involves serial use of specialized assays spanning multiple technologies. In principle, genome sequencing (GS) can detect all genomic pathogenic variant types on a single platform. Here we evaluate copy-number variant (CNV) calling as part of a clinically accredited GS test. We performed analytical validation of CNV calling on 17 reference samples, compared the sensitivity of GS-based variants with those from a clinical microarray, and set a bound on precision using orthogonal technologies. We developed a protocol for family-based analysis of GS-based CNV calls, and deployed this across a clinical cohort of 79 rare and undiagnosed cases. We found that CNV calls from GS are at least as sensitive as those from microarrays, while only creating a modest increase in the number of variants interpreted (~10 CNVs per case). We identified clinically significant CNVs in 15% of the first 79 cases analyzed, all of which were confirmed by an orthogonal approach. The pipeline also enabled discovery of a uniparental disomy (UPD) and a 50% mosaic trisomy 14. Directed analysis of select CNVs enabled breakpoint level resolution of genomic rearrangements and phasing of de novo CNVs. Robust identification of CNVs by GS is possible within a clinical testing environment.


April 21, 2020  |  

Long-read assembly of the Chinese rhesus macaque genome and identification of ape-specific structural variants.

We present a high-quality de novo genome assembly (rheMacS) of the Chinese rhesus macaque (Macaca mulatta) using long-read sequencing and multiplatform scaffolding approaches. Compared to the current Indian rhesus macaque reference genome (rheMac8), rheMacS increases sequence contiguity 75-fold, closing 21,940 of the remaining assembly gaps (60.8 Mbp). We improve gene annotation by generating more than two million full-length transcripts from ten different tissues by long-read RNA sequencing. We sequence resolve 53,916 structural variants (96% novel) and identify 17,000 ape-specific structural variants (ASSVs) based on comparison to ape genomes. Many ASSVs map within ChIP-seq predicted enhancer regions where apes and macaque show diverged enhancer activity and gene expression. We further characterize a subset that may contribute to ape- or great-ape-specific phenotypic traits, including taillessness, brain volume expansion, improved manual dexterity, and large body size. The rheMacS genome assembly serves as an ideal reference for future biomedical and evolutionary studies.


April 21, 2020  |  

Systematic analysis of dark and camouflaged genes reveals disease-relevant genes hiding in plain sight.

The human genome contains “dark” gene regions that cannot be adequately assembled or aligned using standard short-read sequencing technologies, preventing researchers from identifying mutations within these gene regions that may be relevant to human disease. Here, we identify regions with few mappable reads that we call dark by depth, and others that have ambiguous alignment, called camouflaged. We assess how well long-read or linked-read technologies resolve these regions. Based on standard whole-genome Illumina sequencing data, we identify 36,794 dark regions in 6054 gene bodies from pathways important to human health, development, and reproduction. Of these gene bodies, 8.7% are completely dark and 35.2% are ≥5% dark. We identify dark regions that are present in protein-coding exons across 748 genes. Linked-read or long-read sequencing technologies from 10x Genomics, PacBio, and Oxford Nanopore Technologies reduce dark protein-coding regions to approximately 50.5%, 35.6%, and 9.6%, respectively. We present an algorithm to resolve most camouflaged regions and apply it to the Alzheimer’s Disease Sequencing Project. We rescue a rare ten-nucleotide frameshift deletion in CR1, a top Alzheimer’s disease gene, found in disease cases but not in controls. While we could not formally assess the association of the CR1 frameshift mutation with Alzheimer’s disease due to insufficient sample-size, we believe it merits investigating in a larger cohort. There remain thousands of potentially important genomic regions overlooked by short-read sequencing that are largely resolved by long-read technologies.


September 22, 2019  |  

Searching for convergent pathways in autism spectrum disorders: insights from human brain transcriptome studies.

Autism spectrum disorder (ASD) is one of the most heritable neuropsychiatric conditions. The complex genetic landscape of the disorder includes both common and rare variants at hundreds of genetic loci. This marked heterogeneity has thus far hampered efforts to develop genetic diagnostic panels and targeted pharmacological therapies. Here, we give an overview of the current literature on the genetic basis of ASD, and review recent human brain transcriptome studies and their role in identifying convergent pathways downstream of the heterogeneous genetic variants. We also discuss emerging evidence on the involvement of non-coding genomic regions and non-coding RNAs in ASD.


September 22, 2019  |  

Cow, yak, and camel milk diets differentially modulated the systemic immunity and fecal microbiota of rats

Cow milk is the most widely consumed; however, non-cattle milk has gained increasing interest because of added nutritive values. We compared the health effects of yak, cow, and camel milk in rats. By measuring several plasma immune factors, significantly more interferon-γ was detected in the camel than the yak (P=0.0020) or cow (P=0.0062) milk group. Significantly more IgM was detected in the yak milk than the control group (P=0.0071). The control group had significantly less interleukin 6 than the yak (P=0.0499) and cow (P=0.0248) milk groups. The fecal microbiota of the 144 samples consisted mainly of the Firmicutes (76.70±11.03%), Bacteroidetes (15.27±7.79%), Proteobacteria (3.61±4.34%), and Tenericutes (2.61±2.53%) phyla. Multivariate analyses revealed a mild shift in the fecal microbiota along the milk treatment. We further identified the differential microbes across the four groups. At day 14, 22 differential genera and 28 differential species were identified (P=0.0000–0.0462), while 8 differential genera and 11 differential species (P=0.0000–0.0013) were found at day 28. Some short-chain fatty acid and succinate producers increased, while certain health-concerned bacteria (Prevotella copri, Phascolarctobacterium faecium, and Bacteroides uniformis) decreased after 14 days of yak or camel milk treatment. We demonstrated that different animal milk could confer distinctive nutritive value to the host.


September 22, 2019  |  

Effects of metal and metalloid pollutants on the microbiota composition of feces obtained from twelve commercial pig farms across China.

Understanding the metal and metalloid contamination and microbiota composition of pig feces is an important step required to support the design and implementation of effective pollution control and prevention strategies. A survey was implemented in 12 locations across China to investigate the content of metals and metalloids, and the main composition of the microbial communities of commercially reared pigs during two growth periods, defined as the early (Q group) and the later fattening growth phases (H group). These data showed widespread Al, Mn, Cu, Zn, and Fe pollution in pig feces. The concentration of Zn in the Q group feces was nearly two times higher than the levels measured in the H group. The microbial composition of the Q group exhibited greater richness of operational taxonomic units (OTUs) and fewer bacteria associated with zoonotic diseases compared with the microbial composition of the H group. Spearman rank correlation analysis showed that Cu and northern latitudes had a significant positive effect on the richness of bacterial communities in pig feces. Zn and Cd exhibited the biggest impact on microbial community composition based on canonical correspondence analysis. Functional metagenomic prediction indicated that about 0.8% of the genes present in the pig feces bacterial community are related to human diseases, and significantly more predicted pathogenic genes were detected in the H group than in the Q group. These results support the need to monitor heavy metal contamination and to control for zoonotic pathogens disseminated from pig feces in Chinese pig farms. Copyright © 2018. Published by Elsevier B.V.


September 22, 2019  |  

Single molecule real-time (SMRT) sequencing comes of age: applications and utilities for medical diagnostics.

Short read massive parallel sequencing has emerged as a standard diagnostic tool in the medical setting. However, short read technologies have inherent limitations such as GC bias, difficulties mapping to repetitive elements, trouble discriminating paralogous sequences, and difficulties in phasing alleles. Long read single molecule sequencers resolve these obstacles. Moreover, they offer higher consensus accuracies and can detect epigenetic modifications from native DNA. The first commercially available long read single molecule platform was the RS system based on PacBio’s single molecule real-time (SMRT) sequencing technology, which has since evolved into their RSII and Sequel systems. Here we capsulize how SMRT sequencing is revolutionizing constitutional, reproductive, cancer, microbial and viral genetic testing. © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.


September 22, 2019  |  

Sixteen diverse laboratory mouse reference genomes define strain-specific haplotypes and novel functional loci.

We report full-length draft de novo genome assemblies for 16 widely used inbred mouse strains and find extensive strain-specific haplotype variation. We identify and characterize 2,567 regions on the current mouse reference genome exhibiting the greatest sequence diversity. These regions are enriched for genes involved in pathogen defence and immunity and exhibit enrichment of transposable elements and signatures of recent retrotransposition events. Combinations of alleles and genes unique to an individual strain are commonly observed at these loci, reflecting distinct strain phenotypes. We used these genomes to improve the mouse reference genome, resulting in the completion of 10 new gene structures. Also, 62 new coding loci were added to the reference genome annotation. These genomes identified a large, previously unannotated, gene (Efcab3-like) encoding 5,874 amino acids. Mutant Efcab3-like mice display anomalies in multiple brain regions, suggesting a possible role for this gene in the regulation of brain development.


September 22, 2019  |  

SQANTI: extensive characterization of long-read transcript sequences for quality control in full-length transcriptome identification and quantification.

High-throughput sequencing of full-length transcripts using long reads has paved the way for the discovery of thousands of novel transcripts, even in well-annotated mammalian species. The advances in sequencing technology have created a need for studies and tools that can characterize these novel variants. Here, we present SQANTI, an automated pipeline for the classification of long-read transcripts that can assess the quality of data and the preprocessing pipeline using 47 unique descriptors. We apply SQANTI to a neuronal mouse transcriptome using Pacific Biosciences (PacBio) long reads and illustrate how the tool is effective in characterizing and describing the composition of the full-length transcriptome. We perform extensive evaluation of ToFU PacBio transcripts by PCR to reveal that an important number of the novel transcripts are technical artifacts of the sequencing approach and that SQANTI quality descriptors can be used to engineer a filtering strategy to remove them. Most novel transcripts in this curated transcriptome are novel combinations of existing splice sites, resulting more frequently in novel ORFs than novel UTRs, and are enriched in both general metabolic and neural-specific functions. We show that these new transcripts have a major impact in the correct quantification of transcript levels by state-of-the-art short-read-based quantification algorithms. By comparing our iso-transcriptome with public proteomics databases, we find that alternative isoforms are elusive to proteogenomics detection. SQANTI allows the user to maximize the analytical outcome of long-read technologies by providing the tools to deliver quality-evaluated and curated full-length transcriptomes. © 2018 Tardaguila et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019  |  

Hybrid error correction and de novo assembly of single-molecule sequencing reads.

Single-molecule sequencing instruments can generate multikilobase sequences with the potential to greatly improve genome and transcriptome assembly. However, the error rates of single-molecule reads are high, which has limited their use thus far to resequencing bacteria. To address this limitation, we introduce a correction algorithm and assembly strategy that uses short, high-fidelity sequences to correct the error in single-molecule sequences. We demonstrate the utility of this approach on reads generated by a PacBio RS instrument from phage, prokaryotic and eukaryotic whole genomes, including the previously unsequenced genome of the parrot Melopsittacus undulatus, as well as for RNA-Seq reads of the corn (Zea mays) transcriptome. Our long-read correction achieves >99.9% base-call accuracy, leading to substantially better assemblies than current sequencing strategies: in the best example, the median contig size was quintupled relative to high-coverage, second-generation assemblies. Greater gains are predicted if read lengths continue to increase, including the prospect of single-contig bacterial chromosome assembly.


September 22, 2019  |  

Altered expression of the FMR1 splicing variants landscape in premutation carriers.

FMR1 premutation carriers (55-200 CGG repeats) are at risk for developing Fragile X-associated Tremor/Ataxia Syndrome (FXTAS), an adult onset neurodegenerative disorder. Approximately 20% of female carriers will develop Fragile X-associated Primary Ovarian Insufficiency (FXPOI), in addition to a number of clinical problems affecting premutation carriers throughout their life span. Marked elevation in FMR1 mRNA levels has been observed with premutation alleles resulting in RNA toxicity, the leading molecular mechanism proposed for the FMR1 associated disorders observed in premutation carriers. The FMR1 gene undergoes alternative splicing and we have recently reported that the relative abundance of all FMR1 mRNA isoforms is significantly increased in premutation carriers. In this study, we characterized the transcriptional FMR1 isoforms distribution pattern in different tissues and identified a total of 49 isoforms, some of which were observed only in premutation carriers and which might play a role in the pathogenesis of FXTAS. Further, we investigated the distribution pattern and expression levels of the FMR1 isoforms in asymptomatic premutation carriers and in those with FXTAS and found no significant differences between the two groups. Our findings suggest that the characterization of the expression levels of the different FMR1 isoforms is fundamental for understanding the regulation of the FMR1 gene as imbalance in their expression could lead to an altered functional diversity with neurotoxic consequences. Their characterization will also help to elucidate the mechanism(s) by which “toxic gain of function” of the FMR1 mRNA may play a role in FXTAS and/or in the other FMR1-associated conditions. Copyright © 2017. Published by Elsevier B.V.


September 22, 2019  |  

Meeting report: 31st International Mammalian Genome Conference, Mammalian Genetics and Genomics: From Molecular Mechanisms to Translational Applications.

High on the Heidelberg hills, inside the Advanced Training Centre of the European Molecular Biology Laboratory (EMBL) campus with its unique double-helix staircase, scientists gathered for the EMBL conference “Mammalian Genetics and Genomics: From Molecular Mechanisms to Translational Applications,” organized in cooperation with the International Mammalian Genome Society (IMGS) and the Mouse Molecular Genetics (MMG) group. The conference attracted 205 participants from 30 countries, representing 6 of the 7 continents, all except Antarctica. It was a richly diverse group of geneticists, clinicians, and bioinformaticians, with presentations by established and junior investigators, including many trainees. From the 24th to the 27th of October 2017, they shared exciting advances in mammalian genetics and genomics research, from the introduction of cutting-edge technologies to descriptions of translational studies involving highly relevant models of human disease.


September 22, 2019  |  

Transcriptional fates of human-specific segmental duplications in brain.

Despite the importance of duplicate genes for evolutionary adaptation, accurate gene annotation is often incomplete, incorrect, or lacking in regions of segmental duplication. We developed an approach combining long-read sequencing and hybridization capture to yield full-length transcript information and confidently distinguish between nearly identical genes/paralogs. We used biotinylated probes to enrich for full-length cDNA from duplicated regions, which were then amplified, size-fractionated, and sequenced using single-molecule, long-read sequencing technology, permitting us to distinguish between highly identical genes by virtue of multiple paralogous sequence variants. We examined 19 gene families as expressed in developing and adult human brain, selected for their high sequence identity (average >99%) and overlap with human-specific segmental duplications (SDs). We characterized the transcriptional differences between related paralogs to better understand the birth-death process of duplicate genes and particularly how the process leads to gene innovation. In 48% of the cases, we find that the expressed duplicates have changed substantially from their ancestral models due to novel sites of transcription initiation, splicing, and polyadenylation, as well as fusion transcripts that connect duplication-derived exons with neighboring genes. We detect unannotated open reading frames in genes currently annotated as pseudogenes, while relegating other duplicates to nonfunctional status. Our method significantly improves gene annotation, specifically defining full-length transcripts, isoforms, and open reading frames for new genes in highly identical SDs. The approach will be more broadly applicable to genes in structurally complex regions of other genomes where the duplication process creates novel genes important for adaptive traits. © 2018 Dougherty et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019  |  

Analyses of intestinal microbiota: culture versus sequencing.

Analyzing human as well as animal microbiota composition has gained growing interest because structural components and metabolites of microorganisms fundamentally influence all aspects of host physiology. The exploration of these ecosystems was originally dominated by culture-dependent methods, but the development of molecular techniques such as high-throughput sequencing has dramatically increased our knowledge. Because many studies of the microbiota are based on the bacterial 16S ribosomal RNA (rRNA) gene targets, they can, at least in principle, be compared to determine the role of the microbiome composition for developmental processes, host metabolism, and physiology as well as different diseases. In our review, we will summarize differences and pitfalls in current experimental protocols, including all steps from nucleic acid extraction to bioinformatical analysis, which may produce variation that outweighs subtle biological differences. Future developments, such as integration of metabolomic, transcriptomic, and metagenomic data sets and standardization of the procedures, will be discussed. © The Author 2015. Published by Oxford University Press on behalf of the Institute for Laboratory Animal Research. All rights reserved. For permissions, please email: journals.permissions@oup.com.


September 22, 2019  |  

The human microbiome and understanding the 16S rRNA gene in translational nursing science.

As more is understood regarding the human microbiome, it is increasingly important for nurse scientists and healthcare practitioners to analyze these microbial communities and their role in health and disease. 16S rRNA sequencing is a key methodology in identifying these bacterial populations that has recently transitioned from use primarily in research to having increased utility in clinical settings. The objectives of this review are to (a) describe 16S rRNA sequencing and its role in answering research questions important to nursing science; (b) provide an overview of the oral, lung, and gut microbiomes and relevant research; and (c) identify future implications for microbiome research and 16S sequencing in translational nursing science. Sequencing using the 16S rRNA gene has revolutionized research and allowed scientists to easily and reliably characterize complex bacterial communities. This type of research has recently entered the clinical setting, one of the best examples involving the use of 16S sequencing to identify resistant pathogens, thereby improving the accuracy of bacterial identification in infection control. Clinical microbiota research and related requisite methods are of particular relevance to nurse scientists, individuals uniquely positioned to utilize these techniques in future studies in clinical settings.

