July 19, 2019

Introduction: The host-associated microbiome: Pattern, process and function.

An explosion of studies in recent years has established the ubiquity of host-associated microbes and their centrality to host biology (McFall-Ngai et al., 2013; Russell, Dubilier, & Rudgers, 2014). Microbes aid in digestion, modulate development, contribute to host immunity, mediate abiotic stress and more. While relationships with host-associated microbes are ubiquitous and important, they are certainly not monolithic. Characterizing the microbial diversity associated with an ever-broadening array of hosts (diverse animals, plants, algae and protists) has shown that essential functions can be performed by microbes that are integrated with the host to varying degrees, ranging from embedded endosymbionts to a variable cast of transient microbes acquired from the environment. The maturing host–microbiome field is now developing a mechanistic understanding of host/microbe relationships across this spectrum and the crosstalk mediating these interactions. Similarly, studies across systems are illuminating the ecological and evolutionary factors that shape host–microbe interactions today and providing hints into the origins of specific relationships.


July 19, 2019

De novo repeat interruptions are associated with reduced somatic instability and mild or absent clinical features in myotonic dystrophy type 1.

Myotonic dystrophy type 1 (DM1) is a multisystem disorder, caused by expansion of a CTG trinucleotide repeat in the 3′-untranslated region of the DMPK gene. The repeat expansion is somatically unstable and tends to increase in length with time, contributing to disease progression. In some individuals, the repeat array is interrupted by variant repeats such as CCG and CGG, stabilising the expansion and often leading to milder symptoms. We have characterised three families, each including one person with variant repeats that had arisen de novo on paternal transmission of the repeat expansion. Two individuals were identified for screening due to an unusual result in the laboratory diagnostic test, and the third due to exceptionally mild symptoms. The presence of variant repeats in all three expanded alleles was confirmed by restriction digestion of small pool PCR products, and allele structures were determined by PacBio sequencing. Each was different, but all contained CCG repeats close to the 3′-end of the repeat expansion. All other family members had inherited pure CTG repeats. The variant repeat-containing alleles were more stable in the blood than pure alleles of similar length, which may in part account for the mild symptoms observed in all three individuals. This emphasises the importance of somatic instability as a disease mechanism in DM1. Further, since patients with variant repeats may have unusually mild symptoms, identification of these individuals has important implications for genetic counselling and for patient stratification in DM1 clinical trials.


July 19, 2019

A Borrelia burgdorferi mini-vls system that undergoes antigenic switching in mice: investigation of the role of plasmid topology and the long inverted repeat.

Borrelia burgdorferi evades the host immune system by switching the surface antigen VlsE in a process known as antigenic variation. The DNA mechanisms and genetic elements present on the vls locus that participate in the switching process remain to be elucidated. Manipulating the vls locus has been difficult due to its instability on Escherichia coli plasmids. In this study, we generated for the first time a mini-vls system composed of a single silent vlsE variable region (silent cassette 2) through the vlsE gene by performing some cloning steps directly in a highly transformable B. burgdorferi strain. Variants of the mini system were constructed with or without the long inverted repeat (IR) located upstream of vlsE and on both circular and linear plasmids to investigate the importance of the IR and plasmid topology on recombinational switching at vlsE. Amplicon sequencing using PacBio long read technology and analysis of the data with our recently reported pipeline and VAST software showed that the system undergoes switching in mice in both linear and circular versions. The presence of the hairpin does not seem to be crucial in the linear version; however, it is required when the topology is circular. © 2018 John Wiley & Sons Ltd.


July 19, 2019

Comparison between complete genomes of an isolate of Pseudomonas syringae pv. actinidiae from Japan and a New Zealand isolate of the pandemic.

The modern pandemic of the bacterial kiwifruit pathogen Pseudomonas syringae pv. actinidiae (Psa) is caused by a particular Psa lineage. To better understand the genetic basis of the virulence of this lineage, we compare the completely assembled genome of a pandemic New Zealand strain with that of the Psa type strain first isolated in Japan in 1983. Aligning the two genomes shows numerous translocations, constrained so as to retain the appropriate orientation of the Architecture Imparting Sequences (AIMs). There are several large horizontally acquired regions, some of which include Type I, Type II or Type III restriction systems. The activity of these systems is reflected in the methylation patterns of the two strains. The pandemic strain carries an Integrative Conjugative Element (ICE) located at a tRNA-Lys site. Two other complex elements are also present at tRNA-Lys sites in the genome. These elements are derived from ICE but have now acquired some alternative secretion function. There are numerous types of mobile element in the two genomes. Analysis of these elements reveals no evidence of recombination between the two Psa lineages.


July 19, 2019

Adaptation and conservation insights from the koala genome.

The koala, the only extant species of the marsupial family Phascolarctidae, is classified as ‘vulnerable’ due to habitat loss and widespread disease. We sequenced the koala genome, producing a complete and contiguous marsupial reference genome, including centromeres. We reveal that the koala’s ability to detoxify eucalypt foliage may be due to expansions within a cytochrome P450 gene family, and its ability to smell, taste and moderate ingestion of plant secondary metabolites may be due to expansions in the vomeronasal and taste receptors. We characterized novel lactation proteins that protect young in the pouch and annotated immune genes important for response to chlamydial disease. Historical demography showed a substantial population crash coincident with the decline of Australian megafauna, while contemporary populations had biogeographic boundaries and increased inbreeding in populations affected by historic translocations. We identified genetically diverse populations that require habitat corridors and instituting of translocation programs to aid the koala’s survival in the wild.


July 19, 2019

Fern genomes elucidate land plant evolution and cyanobacterial symbioses.

Ferns are the closest sister group to all seed plants, yet little is known about their genomes other than that they are generally colossal. Here, we report on the genomes of Azolla filiculoides and Salvinia cucullata (Salviniales) and present evidence for episodic whole-genome duplication in ferns: one at the base of ‘core leptosporangiates’ and one specific to Azolla. One fern-specific gene that we identified, recently shown to confer high insect resistance, seems to have been derived from bacteria through horizontal gene transfer. Azolla coexists in a unique symbiosis with N2-fixing cyanobacteria, and we demonstrate a clear pattern of cospeciation between the two partners. Furthermore, the Azolla genome lacks genes that are common to arbuscular mycorrhizal and root nodule symbioses, and we identify several putative transporter genes specific to Azolla-cyanobacterial symbiosis. These genomic resources will help in exploring the biotechnological potential of Azolla and address fundamental questions in the evolution of plant life.


July 19, 2019

Identification and analysis of adenine N6-methylation sites in the rice genome.

DNA N6-methyladenine (6mA) is a non-canonical DNA modification that is present at low levels in different eukaryotes [1-8], but its prevalence and genomic function in higher plants are unclear. Using mass spectrometry, immunoprecipitation and validation with analysis of single-molecule real-time sequencing, we observed that about 0.2% of all adenines are 6mA methylated in the rice genome. 6mA occurs most frequently at GAGG motifs and is mapped to about 20% of genes and 14% of transposable elements. In promoters, 6mA marks silent genes, but in gene bodies it correlates with gene activity. 6mA overlaps with 5-methylcytosine (5mC) at CG sites in gene bodies and is complementary to 5mC at CHH sites in transposable elements. We show that OsALKBH1 may be involved in 6mA demethylation in rice. The results suggest that 6mA is complementary to 5mC as an epigenomic mark in rice and reinforce a distinct role for 6mA as a gene expression-associated epigenomic mark in eukaryotes.


July 19, 2019

Complete genome sequences of extremely thermoacidophilic metal-mobilizing type strain members of the archaeal family Sulfolobaceae, Acidianus brierleyi DSM-1651, Acidianus sulfidivorans DSM-18786, and Metallosphaera hakonensis DSM-7519.

The family Sulfolobaceae contains extremely thermoacidophilic archaea that are found in terrestrial environments. Here, we report three closed genomes from two currently defined genera within the family, namely, Acidianus brierleyi DSM-1651T, Acidianus sulfidivorans DSM-18786T, and Metallosphaera hakonensis DSM-7519T.


July 19, 2019

Deep genome annotation of the opportunistic human pathogen Streptococcus pneumoniae D39.

A precise understanding of the genomic organization into transcriptional units and their regulation is essential for our comprehension of opportunistic human pathogens and how they cause disease. Using single-molecule real-time (PacBio) sequencing we unambiguously determined the genome sequence of Streptococcus pneumoniae strain D39 and revealed several inversions previously undetected by short-read sequencing. Significantly, a chromosomal inversion results in antigenic variation of PhtD, an important surface-exposed virulence factor. We generated a new genome annotation using automated tools, followed by manual curation, reflecting the current knowledge in the field. By combining sequence-driven terminator prediction, deep paired-end transcriptome sequencing and enrichment of primary transcripts by Cappable-Seq, we mapped 1015 transcriptional start sites and 748 termination sites. We show that the pneumococcal transcriptional landscape is complex and includes many secondary, antisense and internal promoters. Using this new genomic map, we identified several new small RNAs (sRNAs), RNA switches (including sixteen previously misidentified as sRNAs), and antisense RNAs. In total, we annotated 89 new protein-encoding genes, 34 sRNAs and 165 pseudogenes, bringing the S. pneumoniae D39 repertoire to 2146 genetic elements. We report operon structures and observed that 9% of operons are leaderless. The genome data are accessible in an online resource called PneumoBrowse (https://veeninglab.com/pneumobrowse) providing one of the most complete inventories of a bacterial genome to date. PneumoBrowse will accelerate pneumococcal research and the development of new prevention and treatment strategies.


July 19, 2019

How well can we create phased, diploid, human genomes?: An assessment of FALCON-Unzip phasing using a human trio

Long read sequencing technology has allowed researchers to create de novo assemblies with impressive continuity[1,2]. This advancement has dramatically increased the number of reference genomes available and hints at the possibility of a future where personal genomes are assembled rather than resequenced. In 2016 Pacific Biosciences released the FALCON-Unzip framework, which can provide long, phased haplotype contigs from de novo assemblies. This phased genome algorithm enhances assembly accuracy for highly heterozygous organisms and allows researchers to explore questions that require haplotype information such as allele-specific expression and regulation. However, validation of this technique has been limited to small genomes or inbred individuals[3]. As a roadmap to personal genome assembly and phasing, we assess the phasing accuracy of FALCON-Unzip in humans using publicly available data for the Ashkenazi trio from the Genome in a Bottle Consortium[4]. To assess the accuracy of the Unzip algorithm, we assembled the genome of the son using FALCON and FALCON-Unzip, genotyped publicly available short read data for the mother and the father, and observed the inheritance pattern of the parental SNPs along the phased genome of the son. We found that 72.8% of haplotype contigs share SNPs with only one parent, suggesting that these contigs are correctly phased. Most mis-phased SNPs are random but present in high frequency toward the end of haplotype contigs. Approximately 20.7% of mis-phased haplotype contigs contain clusters of mis-phased SNPs, suggesting that haplotypes were mis-joined by FALCON-Unzip. Mis-joined boundaries in those contigs are located in areas of low SNP density. This research demonstrates that the FALCON-Unzip algorithm can be used to create long and accurate haplotypes for humans and identifies problematic regions that could benefit from future improvement.
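
The trio-based check in this abstract (following parental SNPs along each haplotype contig of the son) can be illustrated with a minimal sketch. It assumes, hypothetically, that every informative SNP on a contig has already been labelled as private to the mother ('M') or the father ('F'). The function names, the labels, and the `min_purity` threshold are illustrative assumptions and are not taken from the study or from FALCON-Unzip.

```python
from collections import Counter


def classify_contig(snp_origins, min_purity=1.0):
    """Classify one haplotype contig from parent-of-origin SNP labels.

    snp_origins: iterable of 'M' (maternal-private) or 'F' (paternal-private).
    min_purity: hypothetical threshold, the fraction of informative SNPs
                that must agree on one parent for the contig to count as
                consistently phased.
    """
    counts = Counter(o for o in snp_origins if o in ("M", "F"))
    total = counts["M"] + counts["F"]
    if total == 0:
        return "uninformative"
    parent, n = counts.most_common(1)[0]
    if n / total >= min_purity:
        return "maternal" if parent == "M" else "paternal"
    return "mixed"  # SNPs from both parents: candidate mis-phasing or mis-join


def fraction_consistent(contigs, min_purity=1.0):
    """Fraction of informative contigs whose SNPs point to a single parent."""
    labels = [classify_contig(c, min_purity) for c in contigs.values()]
    informative = [label for label in labels if label != "uninformative"]
    if not informative:
        return 0.0
    return sum(label != "mixed" for label in informative) / len(informative)


# Toy input (made-up labels, not data from the study).
contigs = {"h1": "MMMM", "h2": "FFF", "h3": "MMFM", "h4": ""}
print({name: classify_contig(c) for name, c in contigs.items()})
print(fraction_consistent(contigs))
```

A strict threshold of 1.0 mirrors the "SNPs shared with only one parent" criterion; a lower value would tolerate isolated genotyping errors at the cost of hiding short mis-phased stretches.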


July 19, 2019

Long-read sequencing across the C9orf72 ‘GGGGCC’ repeat expansion: implications for clinical use and genetic discovery efforts in human disease.

Many neurodegenerative diseases are caused by nucleotide repeat expansions, but most expansions, like the C9orf72 ‘GGGGCC’ (G4C2) repeat that causes approximately 5-7% of all amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) cases, are too long to sequence using short-read sequencing technologies. It is unclear whether long-read sequencing technologies can traverse these long, challenging repeat expansions. Here, we demonstrate that two long-read sequencing technologies, Pacific Biosciences’ (PacBio) and Oxford Nanopore Technologies’ (ONT), can sequence through disease-causing repeats cloned into plasmids, including the FTD/ALS-causing G4C2 repeat expansion. We also report the first long-read sequencing data characterizing the C9orf72 G4C2 repeat expansion at the nucleotide level in two symptomatic expansion carriers using PacBio whole-genome sequencing and a no-amplification (No-Amp) targeted approach based on CRISPR/Cas9. Both the PacBio and ONT platforms successfully sequenced through the repeat expansions in plasmids. Throughput on the MinION was a challenge for whole-genome sequencing; we were unable to attain reads covering the human C9orf72 repeat expansion using 15 flow cells. We obtained 8× coverage across the C9orf72 locus using the PacBio Sequel, accurately reporting the unexpanded allele at eight repeats, and reading through the entire expansion with 1324 repeats (7941 nucleotides). Using the No-Amp targeted approach, we attained >800× coverage and were able to identify the unexpanded allele, closely estimate expansion size, and assess nucleotide content in a single experiment. We estimate the individual’s repeat region was >99% G4C2 content, though we cannot rule out small interruptions. Our findings indicate that long-read sequencing is well suited to characterizing known repeat expansions, and for discovering new disease-causing, disease-modifying, or risk-modifying repeat expansions that have gone undetected with conventional short-read sequencing. The PacBio No-Amp targeted approach may have future potential in clinical and genetic counseling environments. Larger and deeper long-read sequencing studies in C9orf72 expansion carriers will be important to determine heterogeneity and whether the repeats are interrupted by non-G4C2 content, potentially mitigating or modifying disease course or age of onset, as interruptions are known to do in other repeat-expansion disorders. These results have broad implications across all diseases where the genetic etiology remains unclear.
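
As a rough illustration of the two quantities discussed in this abstract, repeat count and G4C2 content, the sketch below counts the longest uninterrupted run of the motif in a read and estimates what fraction of a region is made up of exact motif copies. It is a simplified assumption-laden example, not the authors' analysis: the function names are hypothetical, the read is assumed to be on the GGGGCC strand, and sequencing errors are ignored.

```python
import re

MOTIF = "GGGGCC"  # G4C2 hexanucleotide unit


def longest_repeat_run(seq, motif=MOTIF):
    """Return (repeat_count, length_nt) of the longest uninterrupted tandem run."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    best = max((m.group(0) for m in pattern.finditer(seq)), key=len, default="")
    return len(best) // len(motif), len(best)


def motif_purity(region, motif=MOTIF):
    """Fraction of bases in region covered by non-overlapping exact motif copies."""
    if not region:
        return 0.0
    return region.count(motif) * len(motif) / len(region)


# Toy sequences (constructed for illustration, not real reads).
unexpanded = "ACGT" + MOTIF * 8 + "TTT"
interrupted = MOTIF * 3 + "GGGGCT" + MOTIF * 2

print(longest_repeat_run(unexpanded))   # eight units, 48 nt
print(longest_repeat_run(interrupted))  # run is broken by the GGGGCT unit
print(motif_purity(interrupted))
```

Exact string matching is the simplest possible choice here; real long reads contain insertions and deletions, so a practical tool would need error-tolerant alignment to the motif.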


July 19, 2019

Characterization of a human-specific tandem repeat associated with bipolar disorder and schizophrenia.

Bipolar disorder (BD) and schizophrenia (SCZ) are highly heritable diseases that affect more than 3% of individuals worldwide. Genome-wide association studies have strongly and repeatedly linked risk for both of these neuropsychiatric diseases to a 100 kb interval in the third intron of the human calcium channel gene CACNA1C. However, the causative mutation is not yet known. We have identified a human-specific tandem repeat in this region that is composed of 30 bp units, often repeated hundreds of times. This large tandem repeat is unstable using standard polymerase chain reaction and bacterial cloning techniques, which may have resulted in its incorrect size in the human reference genome. The large 30-mer repeat region is polymorphic in both size and sequence in human populations. Particular sequence variants of the 30-mer are associated with risk status at several flanking single-nucleotide polymorphisms in the third intron of CACNA1C that have previously been linked to BD and SCZ. The tandem repeat arrays function as enhancers that increase reporter gene expression in a human neural progenitor cell line. Different human arrays vary in the magnitude of enhancer activity, and the 30-mer arrays associated with increased psychiatric disease risk status have decreased enhancer activity. Changes in the structure and sequence of these arrays likely contribute to changes in CACNA1C function during human evolution and may modulate neuropsychiatric disease risk in modern human populations. Copyright © 2018. Published by Elsevier Inc.


July 19, 2019

Accelerated ex situ breeding of GBSS- and PTST1-edited cassava for modified starch.

Crop diversification required to meet demands for food security and industrial use is often challenged by breeding time and amenability of varieties to genome modification. Cassava is one such crop. Grown for its large starch-rich storage roots, it serves as a staple food and a commodity in the multibillion-dollar starch industry. Starch is composed of the glucose polymers amylopectin and amylose, with the latter strongly influencing the physicochemical properties of starch during cooking and processing. We demonstrate that CRISPR-Cas9 (clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9)-mediated targeted mutagenesis of two genes involved in amylose biosynthesis, PROTEIN TARGETING TO STARCH (PTST1) or GRANULE BOUND STARCH SYNTHASE (GBSS), can reduce or eliminate amylose content in root starch. Integration of the Arabidopsis FLOWERING LOCUS T gene in the genome-editing cassette allowed us to accelerate flowering, an event seldom seen under glasshouse conditions. Germinated seeds yielded S1, a transgene-free progeny that inherited edited genes. This attractive new plant breeding technique for modified cassava could be extended to other crops to provide a suite of novel varieties with useful traits for food and industrial applications.


July 19, 2019

From short reads to chromosome-scale genome assemblies.

A high-quality, annotated genome assembly is the foundation for many downstream studies. However, obtaining such an assembly is a complex, reiterative process that requires the assimilation of high-quality data and combines different approaches and data types. While some software packages incorporating multiple steps of genome assembly are commercially available, they may not be flexible enough to be routinely applied to all organisms, particularly to nonmodel species such as pathogenic oomycetes and fungi. If researchers understand and apply the most appropriate, currently available tools for each step, it is possible to customize parameters and optimize results for their organism of study. Based on our experience of de novo assembly and annotation of several oomycete species, this chapter provides a modular workflow from processing of raw reads, to initial assembly generation, through optimization, chromosome-scale scaffolding and annotation, outlining input and output data as well as examples and alternative software used for each step. The accompanying Notes provide background information for each step as well as alternative options. The final result of this workflow could be an annotated, high-quality, validated, chromosome-scale assembly or a draft assembly of sufficient quality to meet specific needs of a project.


July 19, 2019

Genome organization and DNA accessibility control antigenic variation in trypanosomes.

Many evolutionarily distant pathogenic organisms have evolved similar survival strategies to evade the immune responses of their hosts. These include antigenic variation, through which an infecting organism prevents clearance by periodically altering the identity of proteins that are visible to the immune system of the host [1]. Antigenic variation requires large reservoirs of immunologically diverse antigen genes, which are often generated through homologous recombination, as well as mechanisms to ensure the expression of one or very few antigens at any given time. Both homologous recombination and gene expression are affected by three-dimensional genome architecture and local DNA accessibility [2,3]. Factors that link three-dimensional genome architecture, local chromatin conformation and antigenic variation have, to our knowledge, not yet been identified in any organism. One of the major obstacles to studying the role of genome architecture in antigenic variation has been the highly repetitive nature and heterozygosity of antigen-gene arrays, which has precluded complete genome assembly in many pathogens. Here we report the de novo haplotype-specific assembly and scaffolding of the long antigen-gene arrays of the model protozoan parasite Trypanosoma brucei, using long-read sequencing technology and conserved features of chromosome folding [4]. Genome-wide chromosome conformation capture (Hi-C) reveals a distinct partitioning of the genome, with antigen-encoding subtelomeric regions that are folded into distinct, highly compact compartments. In addition, we performed a range of analyses (Hi-C, fluorescence in situ hybridization, assays for transposase-accessible chromatin using sequencing, and single-cell RNA sequencing) that showed that deletion of the histone variants H3.V and H4.V increases antigen-gene clustering, DNA accessibility across sites of antigen expression and switching of the expressed antigen isoform, via homologous recombination. Our analyses identify histone variants as a molecular link between global genome architecture, local chromatin conformation and antigenic variation.

