April 21, 2020

Improved assembly and variant detection of a haploid human genome using single-molecule, high-fidelity long reads.

The sequence and assembly of human genomes using long-read sequencing technologies has revolutionized our understanding of structural variation and genome organization. We compared the accuracy, continuity, and gene annotation of genome assemblies generated from either high-fidelity (HiFi) or continuous long-read (CLR) datasets from the same complete hydatidiform mole human genome. We find that the HiFi sequence data assemble an additional 10% of duplicated regions and more accurately represent the structure of tandem repeats, as validated with orthogonal analyses. As a result, an additional 5 Mbp of pericentromeric sequences are recovered in the HiFi assembly, resulting in a 2.5-fold increase in the NG50 within 1 Mbp of the centromere (HiFi 480.6 kbp, CLR 191.5 kbp). Additionally, the HiFi genome assembly was generated in significantly less time with fewer computational resources than the CLR assembly. Although the HiFi assembly has significantly improved continuity and accuracy in many complex regions of the genome, it still falls short of the assembly of centromeric DNA and the largest regions of segmental duplication using existing assemblers. Despite these shortcomings, our results suggest that HiFi may be the most effective standalone technology for de novo assembly of human genomes. © 2019 John Wiley & Sons Ltd/University College London.


April 21, 2020

RNA sequencing: the teenage years.

Over the past decade, RNA sequencing (RNA-seq) has become an indispensable tool for transcriptome-wide analysis of differential gene expression and differential splicing of mRNAs. However, as next-generation sequencing technologies have developed, so too has RNA-seq. Now, RNA-seq methods are available for studying many different aspects of RNA biology, including single-cell gene expression, translation (the translatome) and RNA structure (the structurome). Exciting new applications are being explored, such as spatial transcriptomics (spatialomics). Together with new long-read and direct RNA-seq technologies and better computational tools for data analysis, innovations in RNA-seq are contributing to a fuller understanding of RNA biology, from questions such as when and where transcription occurs to the folding and intermolecular interactions that govern RNA function.


April 21, 2020

Complete genome of Pseudomonas sp. DMSP-1 isolated from the Arctic seawater of Kongsfjorden, Svalbard

The genus Pseudomonas is highly metabolically diverse and has colonized a wide range of ecological niches. The strain Pseudomonas sp. DMSP-1 was isolated from Arctic seawater (Kongsfjorden, Svalbard) using dimethylsulfoniopropionate (DMSP) as the sole carbon source. To better understand its role in the Arctic coastal ecosystem, the genome of Pseudomonas sp. strain DMSP-1 was completely sequenced. The genome contained a circular chromosome of 6,282,445 bp with an average GC content of 60.01 mol%. A total of 5510 protein coding genes, 70 tRNA genes and 19 rRNA genes were obtained. However, no genes encoding known enzymes associated with DMSP catabolism were identified in the genome, suggesting that novel DMSP degradation genes might exist in Pseudomonas sp. strain DMSP-1.


April 21, 2020

Single-Cell Virus Sequencing of Influenza Infections That Trigger Innate Immunity.

Influenza virus-infected cells vary widely in their expression of viral genes and only occasionally activate innate immunity. Here, we develop a new method to assess how the genetic variation in viral populations contributes to this heterogeneity. We do this by determining the transcriptome and full-length sequences of all viral genes in single cells infected with a nominally “pure” stock of influenza virus. Most cells are infected by virions with defects, some of which increase the frequency of innate-immune activation. These immunostimulatory defects are diverse and include mutations that perturb the function of the viral polymerase protein PB1, large internal deletions in viral genes, and failure to express the virus’s interferon antagonist NS1. However, immune activation remains stochastic in cells infected by virions with these defects and occasionally is triggered even by virions that express unmutated copies of all genes. Our work shows that the diverse spectrum of defects in influenza virus populations contributes to, but does not completely explain, the heterogeneity in viral gene expression and immune activation in single infected cells. IMPORTANCE: Because influenza virus has a high mutation rate, many cells are infected by mutated virions. But so far, it has been impossible to fully characterize the sequence of the virion infecting any given cell, since conventional techniques such as flow cytometry and single-cell transcriptome sequencing (scRNA-seq) only detect if a protein or transcript is present, not its sequence. Here we develop a new approach that uses long-read PacBio sequencing to determine the sequences of virions infecting single cells. We show that viral genetic variation explains some but not all of the cell-to-cell variability in viral gene expression and innate immune induction. Overall, our study provides the first complete picture of how viral mutations affect the course of infection in single cells. Copyright © 2019 Russell et al.


April 21, 2020

Genome assembly and annotation of the Trichoplusia ni Tni-FNL insect cell line enabled by long-read technologies.

Trichoplusia ni-derived cell lines are commonly used to enable recombinant protein expression via baculovirus infection to generate materials approved for clinical use and in clinical trials. In order to develop systems biology and genome engineering tools to improve protein expression in this host, we performed de novo genome assembly of the Trichoplusia ni-derived cell line Tni-FNL. By integration of PacBio single-molecule sequencing, Bionano optical mapping, and 10X Genomics linked-reads data, we have produced a draft genome assembly of Tni-FNL. Our assembly contains 280 scaffolds, with an N50 scaffold size of 2.3 Mb and a total length of 359 Mb. Annotation of the Tni-FNL genome resulted in 14,101 predicted genes, and 93.2% of the predicted proteome contained recognizable protein domains. Ortholog searches within the superorder Holometabola provided further evidence of high accuracy and completeness of the Tni-FNL genome assembly. This first draft Tni-FNL genome assembly was enabled by complementary long-read technologies and represents a high-quality, well-annotated genome that provides novel insight into the complexity of this insect cell line and can serve as a reference for future large-scale genome engineering work in this and other similar recombinant protein production hosts.


April 21, 2020

Physiological properties and genetic analysis related to exopolysaccharide (EPS) production in the fresh-water unicellular cyanobacterium Aphanothece sacrum (Suizenji Nori).

Clonal phycoerythrin (PE)-rich and PE-poor strains of the unicellular freshwater cyanobacterium Aphanothece sacrum (Suringar) Okada (Suizenji Nori, in Japanese) were isolated from traditional open-air aquafarms in Japan. A. sacrum appeared to be oligotrophic on the basis of its growth characteristics. The optimum temperature for growth was around 20°C. Maximum growth and biomass increase at 20°C were obtained under light intensities between 40 and 80 µmol m⁻² s⁻¹ (fluorescent lamps, 12 h light/12 h dark cycles) for the PE-rich strain and between 40 and 120 µmol m⁻² s⁻¹ for the PE-poor strain of A. sacrum. Purified exopolysaccharide (EPS) of A. sacrum has a molecular weight of ca. 10⁴ kDa with five major monosaccharides (glucose, xylose, rhamnose, galactose and mannose; =85 mol%). We also deciphered the whole genome sequence of the two strains of A. sacrum. The putative genes involved in the polymerization, chain length control, and export of EPS should help elucidate the biosynthetic process of their extremely high molecular weight EPS. The putative genes encoding Wzx-Wzy-Wzz and Wza-Wzb-Wzc were conserved in the A. sacrum strains FPU1 and FPU3. This result suggests that the Wzy-dependent pathway participates in the EPS production of A. sacrum.


April 21, 2020

Combined Genome and Transcriptome (G&T) Sequencing of Single Cells.

The simultaneous examination of a single cell’s genome and transcriptome presents scientists with a powerful tool to study genetic variability and its effect on gene expression. In this chapter, we describe the library generation method for combined genome and transcriptome sequencing (G&T-seq) originally described by Macaulay et al. (Nat Protoc 11(11):2081-2103, 2016; Nat Methods 12(6):519-522, 2015). This includes some alterations we made to improve robustness of this process for both the novice user and laboratories that want to deploy this method at scale. Using this method, genomic DNA and full-length mRNA from single cells are separated, amplified, and converted into Illumina sequencer-compatible sequencing libraries.


April 21, 2020

Human Migration and the Spread of the Nematode Parasite Wuchereria bancrofti.

The human disease lymphatic filariasis causes the debilitating effects of elephantiasis and hydrocele. Lymphatic filariasis currently affects the lives of 90 million people in 52 countries. There are three nematodes that cause lymphatic filariasis, Brugia malayi, Brugia timori, and Wuchereria bancrofti, but 90% of all cases of lymphatic filariasis are caused solely by W. bancrofti (Wb). Here we use population genomics to reconstruct the probable route and timing of migration of Wb strains that currently infect Africa, Haiti, and Papua New Guinea (PNG). We used selective whole genome amplification to sequence 42 whole genomes of single Wb worms from populations in Haiti, Mali, Kenya, and PNG. Our results are consistent with a hypothesis of an Island Southeast Asia or East Asian origin of Wb. Our demographic models support divergence times that correlate with the migration of human populations. We hypothesize that PNG was infected at two separate times, first by the Melanesians and later by the migrating Austronesians. The migrating Austronesians also likely introduced Wb to Madagascar where later migrations spread it to continental Africa. From Africa, Wb spread to the New World during the transatlantic slave trade. Genome scans identified 17 genes that were highly differentiated among Wb populations. Among these are genes associated with human immune suppression, insecticide sensitivity, and proposed drug targets. Identifying the distribution of genetic diversity in Wb populations and selection forces acting on the genome will build a foundation to test future hypotheses and help predict response to current eradication efforts. © The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.


April 21, 2020

A siphonous macroalgal genome suggests convergent functions of homeobox genes in algae and land plants.

Genome evolution and development of unicellular, multinucleate macroalgae (siphonous algae) are poorly known, although various multicellular organisms have been studied extensively. To understand macroalgal developmental evolution, we assembled the ~26 Mb genome of a siphonous green alga, Caulerpa lentillifera, with high contiguity, containing 9,311 protein-coding genes. Molecular phylogeny using 107 nuclear genes indicates that the diversification of the class Ulvophyceae, including C. lentillifera, occurred before the split of the Chlorophyceae and Trebouxiophyceae. Compared with other green algae, the TALE superclass of homeobox genes, which expanded in land plants, shows a series of lineage-specific duplications in this siphonous macroalga. Plant hormone signalling components were also expanded in a lineage-specific manner. Expanded transport regulators, which show spatially different expression, suggest that the structural patterning strategy of a multinucleate cell depends on diversification of nuclear pore proteins. These results not only imply functional convergence of duplicated genes among green plants, but also provide insight into evolutionary roots of green plants. Based on the present results, we propose cellular and molecular mechanisms involved in the structural differentiation in the siphonous alga. © The Author(s) 2019. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.


April 21, 2020

Metagenomic assembly through the lens of validation: recent advances in assessing and improving the quality of genomes assembled from metagenomes.

Metagenomic samples are snapshots of complex ecosystems at work. They comprise hundreds of known and unknown species, contain multiple strain variants and vary greatly within and across environments. Many microbes found in microbial communities are not easily grown in culture making their DNA sequence our only clue into their evolutionary history and biological function. Metagenomic assembly is a computational process aimed at reconstructing genes and genomes from metagenomic mixtures. Current methods have made significant strides in reconstructing DNA segments comprising operons, tandem gene arrays and syntenic blocks. Shorter, higher-throughput sequencing technologies have become the de facto standard in the field. Sequencers are now able to generate billions of short reads in only a few days. Multiple metagenomic assembly strategies, pipelines and assemblers have appeared in recent years. Owing to the inherent complexity of metagenome assembly, regardless of the assembly algorithm and sequencing method, metagenome assemblies contain errors. Recent developments in assembly validation tools have played a pivotal role in improving metagenomics assemblers. Here, we survey recent progress in the field of metagenomic assembly, provide an overview of key approaches for genomic and metagenomic assembly validation and demonstrate the insights that can be derived from assemblies through the use of assembly validation strategies. We also discuss the potential for impact of long-read technologies in metagenomics. We conclude with a discussion of future challenges and opportunities in the field of metagenomic assembly and validation. © The Author 2017. Published by Oxford University Press.


April 21, 2020

Development of CRISPR-Cas systems for genome editing and beyond

The development of clustered regularly interspaced short-palindromic repeat (CRISPR)-Cas systems for genome editing has transformed the way life science research is conducted and holds enormous potential for the treatment of disease as well as for many aspects of biotechnology. Here, I provide a personal perspective on the development of CRISPR-Cas9 for genome editing within the broader context of the field and discuss our work to discover novel Cas effectors and develop them into additional molecular tools. The initial demonstration of Cas9-mediated genome editing launched the development of many other technologies, enabled new lines of biological inquiry, and motivated a deeper examination of natural CRISPR-Cas systems, including the discovery of new types of CRISPR-Cas systems. These new discoveries in turn spurred further technological developments. I review these exciting discoveries and technologies as well as provide an overview of the broad array of applications of these technologies in basic research and in the improvement of human health. It is clear that we are only just beginning to unravel the potential within microbial diversity, and it is quite likely that we will continue to discover other exciting phenomena, some of which it may be possible to repurpose as molecular technologies. The transformation of mysterious natural phenomena to powerful tools, however, takes a collective effort to discover, characterize, and engineer them, and it has been a privilege to join the numerous researchers who have contributed to this transformation of CRISPR-Cas systems.


April 21, 2020

Metatranscriptomic evidence for classical and RuBisCO-mediated CO2 reduction to methane facilitated by direct interspecies electron transfer in a methanogenic system.

In a staged anaerobic fluidized-bed ceramic membrane bioreactor, metagenomic and metatranscriptomic analyses were performed to decipher the microbial interactions on the granular activated carbon (GAC). Metagenome bins representing the predominant microbes in the bioreactor (syntrophic propionate-oxidizing bacteria (SPOB), acetoclastic Methanothrix concilii, and exoelectrogenic Geobacter lovleyi) were successfully recovered for the reconstruction and analysis of metabolic pathways involved in the transformation of fatty acids to methane. In particular, SPOB degraded propionate into acetate, which was further converted into methane and CO2 by M. concilii via acetoclastic methanogenesis. Concurrently, G. lovleyi oxidized acetate into CO2, releasing electrons into the extracellular environment. By accepting these electrons through direct interspecies electron transfer (DIET), M. concilii was capable of performing CO2 reduction for further methane formation. Most notably, an alternative RuBisCO-mediated CO2 reduction (the reductive hexulose-phosphate (RHP) pathway) is transcriptionally active in M. concilii. This RHP pathway enables M. concilii dominance and energy gain by carbon fixation and methanogenesis, respectively, via a methyl-H4MPT intermediate, constituting the third methanogenesis route. The complete acetate reduction (2 mole methane formation/1 mole acetate consumption), coupling acetoclastic methanogenesis and two CO2 reduction pathways, is thermodynamically favorable even under very low substrate conditions (down to the 10⁻⁵ M level). Such tight interactions via both mediated and direct interspecies electron transfer (MIET and DIET), induced by the conductive GAC, promote the overall efficiency of bioenergy processes.


April 21, 2020

Multi-platform discovery of haplotype-resolved structural variation in human genomes.

The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome, and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three- to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community, allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.


April 21, 2020

Platanus-allee is a de novo haplotype assembler enabling a comprehensive access to divergent heterozygous regions.

The ultimate goal for diploid genome determination is to completely decode homologous chromosomes independently, and several programs that phase haplotypes from consensus sequences have been developed. These methods work well for genomes with low heterozygosity, but many species have highly heterozygous genomes. Additionally, there are highly divergent regions (HDRs), where the haplotype sequences differ considerably. Because HDRs are likely to direct various interesting biological phenomena, many genomic analysis targets fall within these regions. However, they cannot be accessed by existing phasing methods, and costly traditional methods have to be adopted instead. Here, we develop a de novo haplotype assembler, Platanus-allee (http://platanus.bio.titech.ac.jp/platanus2), which initially constructs each haplotype sequence and then untangles the assembly graphs utilizing sequence links and synteny information. A comprehensive benchmark analysis reveals that Platanus-allee exhibits high recall and precision, particularly for HDRs. Using this approach, previously unknown HDRs are detected in the human genome, which may uncover novel aspects of genome variability.

