April 21, 2020

The genome of Peromyscus leucopus, natural host for Lyme disease and other emerging infections.

The rodent Peromyscus leucopus is the natural reservoir of several tick-borne infections, including Lyme disease. To expand the knowledge base for this key species in the life cycles of several pathogens, we assembled and scaffolded the P. leucopus genome. The resulting assembly was 2.45 Gb in total length, with 24 chromosome-length scaffolds harboring 97% of predicted genes. RNA sequencing following infection of P. leucopus with Borreliella burgdorferi, a Lyme disease agent, shows that, unlike blood, the skin is actively responding to the infection after several weeks. P. leucopus has a high level of segregating nucleotide variation, suggesting that natural resistance alleles to CRISPR gene targeting constructs are likely segregating in wild populations. The reference genome will allow for experiments aimed at elucidating the mechanisms by which this widely distributed rodent serves as a natural reservoir for several infectious diseases of public health importance, potentially enabling intervention strategies.


April 21, 2020

De novo assembly of the goldfish (Carassius auratus) genome and the evolution of genes after whole-genome duplication.

For over a thousand years, the common goldfish (Carassius auratus) was raised throughout Asia for food and as an ornamental pet. As a very close relative of the common carp (Cyprinus carpio), goldfish share the recent genome duplication that occurred approximately 14 million years ago in their common ancestor. The combination of centuries of breeding and a wide array of interesting body morphologies provides an exciting opportunity to link genotype to phenotype and to understand the dynamics of genome evolution and speciation. We generated a high-quality draft sequence and gene annotations of a “Wakin” goldfish using 71X PacBio long reads. The two subgenomes in goldfish retained extensive synteny and collinearity between goldfish and zebrafish. However, genes were lost quickly after the carp whole-genome duplication, and the expression of 30% of the retained duplicated genes diverged substantially across seven tissues sampled. Loss of sequence identity and/or exons determined the divergence of the expression levels across all tissues, while loss of conserved noncoding elements determined expression variance between different tissues. This assembly provides an important resource for comparative genomics and understanding the causes of goldfish variants.


April 21, 2020

The comparative genomics and complex population history of Papio baboons.

Recent studies suggest that closely related species can accumulate substantial genetic and phenotypic differences despite ongoing gene flow, thus challenging traditional ideas regarding the genetics of speciation. Baboons (genus Papio) are Old World monkeys consisting of six readily distinguishable species. Baboon species hybridize in the wild, and prior data imply a complex history of differentiation and introgression. We produced a reference genome assembly for the olive baboon (Papio anubis) and whole-genome sequence data for all six extant species. We document multiple episodes of admixture and introgression during the radiation of Papio baboons, thus demonstrating their value as a model of complex evolutionary divergence, hybridization, and reticulation. These results help inform our understanding of similar cases, including modern humans, Neanderthals, Denisovans, and other ancient hominins.


April 21, 2020

Decreased metabolism and increased tolerance to extreme environments in Staphylococcus warneri during long-term spaceflight.

Many studies have shown that the space environment can affect bacteria by causing a range of mutations. However, to date, few studies have explored the effects of long-term spaceflight (>1 month) on bacteria. In this study, a Staphylococcus warneri strain that was isolated from the Shenzhou-10 spacecraft and had experienced a spaceflight (15 days) was carried into space again. After a 64-day flight, combined phenotypic, genomic, transcriptomic, and proteomic analyses were performed to compare the influence of the two spaceflights on this bacterium. Compared with short-term spaceflight, long-term spaceflight increased the biofilm formation ability of S. warneri and the cell wall resistance to external environmental stress but reduced the sensitivity to chemical stimulation. Further analysis showed that these changes might be associated with the significantly upregulated gene expression of the phosphotransferase system, which regulates the metabolism of sugars, including glucose, mannose, fructose, and cellobiose. The mutation of S. warneri caused by the 15-day spaceflight was limited at the phenotype and gene level after cultivation on the ground. After 79 days of spaceflight, significant changes in S. warneri were observed. The phosphotransferase system of S. warneri was upregulated by long-term space stimulation, which resulted in a series of changes in the cell wall, biofilm, and chemical sensitivity, thus enhancing the resistance and adaptability of the bacterium to the external environment. © 2019 The Authors. MicrobiologyOpen published by John Wiley & Sons Ltd.


April 21, 2020

Profiling the genome-wide landscape of tandem repeat expansions.

Tandem repeat (TR) expansions have been implicated in dozens of genetic diseases, including Huntington’s Disease, Fragile X Syndrome, and hereditary ataxias. Furthermore, TRs have recently been implicated in a range of complex traits, including gene expression and cancer risk. While the human genome harbors hundreds of thousands of TRs, analysis of TR expansions has been mainly limited to known pathogenic loci. A major challenge is that expanded repeats are beyond the read length of most next-generation sequencing (NGS) datasets and are not profiled by existing genome-wide tools. We present GangSTR, a novel algorithm for genome-wide genotyping of both short and expanded TRs. GangSTR extracts information from paired-end reads into a unified model to estimate maximum likelihood TR lengths. We validate GangSTR on real and simulated data and show that GangSTR outperforms alternative methods in both accuracy and speed. We apply GangSTR to a deeply sequenced trio to profile the landscape of TR expansions in a healthy family and validate novel expansions using orthogonal technologies. Our analysis reveals that healthy individuals harbor dozens of long TR alleles not captured by current genome-wide methods. GangSTR will likely enable discovery of novel disease-associated variants not currently accessible from NGS. © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.
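
The idea of estimating a maximum likelihood TR length from paired-end reads can be illustrated with a toy sketch. This is not GangSTR's model or code: it uses only one class of evidence (read pairs that span the repeat), assumes a Gaussian fragment-length distribution, and finds the copy number by grid search. The function name `ml_repeat_count` and all parameters are hypothetical and chosen for illustration.

```python
import math


def ml_repeat_count(observed_spans, ref_copies, motif_len,
                    frag_mean, frag_sd, max_copies=200):
    """Toy maximum-likelihood estimate of repeat copy number.

    Assumption (not from the abstract): each value in observed_spans is
    the distance between two mates of a spanning read pair as measured
    on the reference. If the sample carries more repeat copies than the
    reference, the mates appear closer together on the reference than
    the true fragment length.
    """
    best_n, best_ll = None, -math.inf
    for n in range(max_copies + 1):
        # Extra bases in the sample allele relative to the reference.
        shift = (n - ref_copies) * motif_len
        ll = 0.0
        for span in observed_spans:
            true_fragment = span + shift
            z = (true_fragment - frag_mean) / frag_sd
            # Log density of a Gaussian fragment-length model.
            ll += -0.5 * z * z - math.log(frag_sd * math.sqrt(2 * math.pi))
        if ll > best_ll:
            best_n, best_ll = n, ll
    return best_n


# Hypothetical data: reference has 10 copies of a 3 bp motif,
# library fragments average 500 bp with a 50 bp standard deviation.
spans = [440, 435, 445, 442, 438]
estimate = ml_repeat_count(spans, ref_copies=10, motif_len=3,
                           frag_mean=500, frag_sd=50)
```

In this made-up example the spans average 440 bp, 60 bp short of the expected fragment length, which corresponds to 20 extra copies of a 3 bp motif. A real genotyper would combine several evidence classes and handle diploid genotypes, which this sketch does not attempt.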


April 21, 2020

SMRT sequencing analysis reveals the full-length transcripts and alternative splicing patterns in Ananas comosus var. bracteatus.

Ananas comosus var. bracteatus is an herbaceous perennial monocot cultivated as an ornamental plant for its chimeric leaves. Because of its genomic complexity, and because no genomic information is available in the public GenBank database, the complete structure of the mRNA transcript is unclear and there are limited molecular mechanism studies for Ananas comosus var. bracteatus. Three size-fractionated full-length cDNA libraries (1-2 kb, 2-3 kb, and 3-6 kb) were constructed and subsequently sequenced in five single-molecule real-time (SMRT) cells (2 cells, 2 cells, and 1 cell, respectively). In total, 19,838 transcripts were identified for alternative splicing (AS) analysis. Among them, 19,185 (96.7%) transcripts were functionally annotated. A total of 9,921 genes were identified by mapping the non-redundant isoforms to the reference genome. A total of 10,649 AS events were identified, the majority of which were intron retention events. The alternatively spliced genes had functions in the basic metabolism processes of the plant such as carbon metabolism, amino acid biosynthesis, and glycolysis. Fourteen genes related to chlorophyll biosynthesis were identified as having AS events. The distribution of the splicing sites and the percentage of conventional and non-canonical AS sites of the genes categorized in pathways related to the albino leaf phenotype (ko00860, ko00195, ko00196, and ko00710) varied greatly. The present results showed that there were 8,316 genes carrying at least one poly(A) site, which generated 21,873 poly(A) sites. These findings indicated that the quality of the gene structure and functional information of the obtained genome was greatly improved, which may facilitate further genetic study of Ananas comosus var. bracteatus.


April 21, 2020

Genes of the pig, Sus scrofa, reconstructed with EvidentialGene.

The pig is a well-studied model animal of biomedical and agricultural importance. Genes of this species, Sus scrofa, are known from experiments and predictions, and are collected in the NCBI Reference Sequence database. Gene reconstruction from transcribed gene evidence of RNA-seq can now accurately and completely reproduce the biological gene sets of animals and plants. Such a gene set for the pig is reported here, including human orthologs missing from current NCBI and Ensembl reference pig gene sets, additional alternate transcripts, and other improvements. The methodology used for accurate and complete gene set reconstruction from RNA is the automated SRA2Genes pipeline of the EvidentialGene project.


April 21, 2020

Rapid antigen diversification through mitotic recombination in the human malaria parasite Plasmodium falciparum.

Malaria parasites possess the remarkable ability to maintain chronic infections that fail to elicit a protective immune response, characteristics that have stymied vaccine development and cause people living in endemic regions to remain at risk of malaria despite previous exposure to the disease. These traits stem from the tremendous antigenic diversity displayed by parasites circulating in the field. For Plasmodium falciparum, the most virulent of the human malaria parasites, this diversity is exemplified by the variant gene family called var, which encodes the major surface antigen displayed on infected red blood cells (RBCs). This gene family exhibits virtually limitless diversity when var gene repertoires from different parasite isolates are compared. Previous studies indicated that this remarkable genome plasticity results from extensive ectopic recombination between var genes during mitotic replication; however, the molecular mechanisms that direct this process to antigen-encoding loci while the rest of the genome remains relatively stable were not determined. Using targeted DNA double-strand breaks (DSBs) and long-read whole-genome sequencing, we show that a single break within an antigen-encoding region of the genome can result in a cascade of recombination events leading to the generation of multiple chimeric var genes, a process that can greatly accelerate the generation of diversity within this family. We also found that recombinations did not occur randomly, but rather high-probability, specific recombination products were observed repeatedly. These results provide a molecular basis for previously described structured rearrangements that drive diversification of this highly polymorphic gene family.


April 21, 2020

Whole genome sequencing of a novel, dichloromethane-fermenting Peptococcaceae from an enrichment culture

Bacteria capable of dechlorinating the toxic environmental contaminant dichloromethane (DCM, CH2Cl2) are of great interest for potential bioremediation applications. A novel, strictly anaerobic, DCM-fermenting bacterium, “DCMF”, was enriched from organochlorine-contaminated groundwater near Botany Bay, Australia. The enrichment culture was maintained in minimal, mineral salt medium amended with dichloromethane as the sole energy source. PacBio whole genome SMRT™ sequencing of DCMF allowed de novo, gap-free assembly despite the presence of cohabiting organisms in the culture. Illumina sequencing reads were utilised to correct minor indels. The single, circularised 6.44 Mb chromosome was annotated with the IMG pipeline and contains 5,773 predicted protein-coding genes. Based on 16S rRNA gene and predicted proteome phylogeny, the organism appears to be a novel member of the Peptococcaceae family. The DCMF genome is large in comparison to known DCM-fermenting bacteria and includes 96 predicted methylamine methyltransferases, which may provide clues to the basis of its DCM metabolism. Full annotation has been provided in a custom genome browser and search tool, in addition to multiple sequence alignments and phylogenetic trees for every predicted protein, available at http://www.slimsuite.unsw.edu.au/research/dcmf/.


April 21, 2020

Deep repeat resolution-the assembly of the Drosophila Histone Complex.

Though the advent of long-read sequencing technologies has led to a leap in contiguity of de novo genome assemblies, current reference genomes of higher organisms still do not provide unbroken sequences of complete chromosomes. Despite reads in excess of 30 000 base pairs, there are still repetitive structures that cannot be resolved by current state-of-the-art assemblers. The most challenging of these structures are tandemly arrayed repeats, which occur in the genomes of all eukaryotes. Untangling tandem repeat clusters is exceptionally difficult, since the rare differences between repeat copies are obscured by the high error rate of long reads. Solving this problem would constitute a major step towards computing fully assembled genomes. Here, we demonstrate by example of the Drosophila Histone Complex that via machine learning algorithms, it is possible to exploit the underlying distinguishing patterns of single nucleotide variants of repeats from very noisy data to resolve a large and highly conserved repeat cluster. The ideas explored in this paper are a first step towards the automated assembly of complex repeat structures and promise to be applicable to a wide range of eukaryotic genomes. © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.


April 21, 2020

Long-read amplicon denoising.

Long-read next-generation amplicon sequencing shows promise for studying complete genes or genomes from complex and diverse populations. Current long-read sequencing technologies have challenging error profiles, hindering data processing and incorporation into downstream analyses. Here we consider the problem of how to reconstruct, free of sequencing error, the true sequence variants and their associated frequencies from PacBio reads. Called ‘amplicon denoising’, this problem has been extensively studied for short-read sequencing technologies, but current solutions do not always successfully generalize to long reads with high indel error rates. We introduce two methods: one that runs nearly instantly and is very accurate for medium length reads and high template coverage, and another, slower method that is more robust when reads are very long or coverage is lower. On two Mock Virus Community datasets with ground truth, each sequenced on a different PacBio instrument, and on a number of simulated datasets, we compare our two approaches to each other and to existing algorithms. We outperform all tested methods in accuracy, with competitive run times even for our slower method, successfully discriminating templates that differ by just a single nucleotide. Julia implementations of Fast Amplicon Denoising (FAD) and Robust Amplicon Denoising (RAD), and a webserver interface, are freely available. © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.


April 21, 2020

High-throughput amplicon sequencing of the full-length 16S rRNA gene with single-nucleotide resolution.

Targeted PCR amplification and high-throughput sequencing (amplicon sequencing) of 16S rRNA gene fragments is widely used to profile microbial communities. New long-read sequencing technologies can sequence the entire 16S rRNA gene, but higher error rates have limited their attractiveness when accuracy is important. Here we present a high-throughput amplicon sequencing methodology based on PacBio circular consensus sequencing and the DADA2 sample inference method that measures the full-length 16S rRNA gene with single-nucleotide resolution and a near-zero error rate. In two artificial communities of known composition, our method recovered the full complement of full-length 16S sequence variants from expected community members without residual errors. The measured abundances of intra-genomic sequence variants were in the integral ratios expected from the genuine allelic variants within a genome. The full-length 16S gene sequences recovered by our approach allowed Escherichia coli strains to be correctly classified to the O157:H7 and K12 sub-species clades. In human fecal samples, our method showed strong technical replication and was able to recover the full complement of 16S rRNA alleles in several E. coli strains. There are likely many applications beyond microbial profiling for which high-throughput amplicon sequencing of complete genes with single-nucleotide resolution will be of use. © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.
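
The observation that intra-genomic variant abundances fall in integral ratios can be checked with a short calculation. The sketch below is not part of DADA2 or the authors' pipeline; the function `infer_copy_numbers`, its parameters, and the read counts are hypothetical. It searches for the smallest total number of 16S copies per genome at which every variant's share of reads is close to a whole number of copies.

```python
def infer_copy_numbers(counts, max_total_copies=15, tolerance=0.2):
    """Guess per-variant copy numbers from read counts of 16S variants
    assumed to come from the same genome.

    Returns a dict of integer copy numbers, or None if no total copy
    count up to max_total_copies fits within the tolerance.
    """
    total = sum(counts.values())
    # Each variant needs at least one copy, so start at len(counts).
    for k in range(len(counts), max_total_copies + 1):
        scaled = {name: c * k / total for name, c in counts.items()}
        fits = all(
            round(s) >= 1 and abs(s - round(s)) <= tolerance
            for s in scaled.values()
        )
        if fits:
            return {name: round(s) for name, s in scaled.items()}
    return None


# Hypothetical read counts for four variants from one strain.
reads = {"v1": 305, "v2": 198, "v3": 101, "v4": 96}
copies = infer_copy_numbers(reads)
```

With these made-up counts the first total that fits is seven copies, split 3:2:1:1. The tolerance is an arbitrary choice, and real data would call for a statistical test rather than a fixed cutoff.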


April 21, 2020

Effector gene reshuffling involves dispensable mini-chromosomes in the wheat blast fungus.

Newly emerged wheat blast disease is a serious threat to global wheat production. Wheat blast is caused by a distinct, exceptionally diverse lineage of the fungus causing rice blast disease. Through sequencing a recent field isolate, we report a reference genome that includes seven core chromosomes and mini-chromosome sequences that harbor effector genes normally found on ends of core chromosomes in other strains. No mini-chromosomes were observed in an early field strain, and at least two from another isolate each contain different effector genes and core chromosome end sequences. The mini-chromosome is enriched in transposons occurring most frequently at core chromosome ends. Additionally, transposons in mini-chromosomes lack the characteristic signature for inactivation by repeat-induced point (RIP) mutation genome defenses. Our results, collectively, indicate that dispensable mini-chromosomes and core chromosomes undergo divergent evolutionary trajectories, and mini-chromosomes and core chromosome ends are coupled as a mobile, fast-evolving effector compartment in the wheat pathogen genome.


April 21, 2020

CD8 T cells targeting adapted epitopes in chronic HIV infection promote dendritic cell maturation and CD4 T cell trans-infection.

HIV-1 frequently escapes from CD8 T cell responses via HLA-I restricted adaptation, leading to the accumulation of adapted epitopes (AE). We previously demonstrated that AE compromise CD8 T cell responses during acute infection and are associated with poor clinical outcomes. Here, we examined the impact of AE on CD8 T cell responses and their biological relevance in chronic HIV infection (CHI). In contrast to acute infection, the majority of AE are immunogenic in CHI. Longitudinal analyses from acute to CHI showed an increased frequency and magnitude of AE-specific IFNγ responses compared to NAE-specific ones. These AE-specific CD8 T cells also were more cytotoxic to CD4 T cells. In addition, AE-specific CD8 T cells expressed lower levels of PD1 and CD57, as well as higher levels of CD28, suggesting a more activated and less exhausted phenotype. During CHI, viral sequencing identified AE-encoding strains as the dominant quasispecies. Despite increased CD4 T cell cytotoxicity, CD8 T cells responding to AE promoted dendritic cell (DC) maturation and CD4 T cell trans-infection, perhaps explaining why AE are predominant in CHI. Taken together, our data suggest that the emergence of AE-specific CD8 T cell responses in CHI confers a selective advantage to the virus by promoting DC-mediated CD4 T cell trans-infection.


April 21, 2020

Chromulinavorax destructans, a pathogen of microzooplankton that provides a window into the enigmatic candidate phylum Dependentiae.

Members of the major candidate phylum Dependentiae (a.k.a. TM6) are widespread across diverse environments from showerheads to peat bogs; yet, with the exception of two isolates infecting amoebae, they are only known from metagenomic data. The limited knowledge of their biology indicates that they have a long evolutionary history of parasitism. Here, we present Chromulinavorax destructans (Strain SeV1), the first isolate of this phylum to infect a representative from a widespread and ecologically significant group of heterotrophic flagellates, the microzooplankter Spumella elongata (Strain CCAP 955/1). Chromulinavorax destructans has a reduced 1.2 Mb genome that is so specialized for infection that it shows no evidence of complete metabolic pathways, but encodes an extensive transporter system for importing nutrients and energy in the form of ATP from the host. Its replication causes extensive reorganization and expansion of the mitochondrion, effectively surrounding the pathogen, consistent with its dependency on the host for energy. Nearly half (44%) of the inferred proteins contain signal sequences for secretion, including many without recognizable similarity to proteins of known function, as well as 98 copies of proteins with an ankyrin-repeat domain; ankyrin-repeats are known effectors of host modulation, suggesting the presence of an extensive host-manipulation apparatus. These observations help to cement members of this phylum as widespread and diverse parasites infecting a broad range of eukaryotic microbes.

