September 22, 2019  |  

Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II’s sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.


September 22, 2019  |  

MEGAN-LR: new algorithms allow accurate binning and easy interactive exploration of metagenomic long reads and contigs.

There are numerous computational tools for taxonomic or functional analysis of microbiome samples, optimized to run on hundreds of millions of short, high quality sequencing reads. Programs such as MEGAN allow the user to interactively navigate these large datasets. Long read sequencing technologies continue to improve and produce increasing numbers of longer reads (of varying lengths in the range of 10k-1M bps, say), but of low quality. There is an increasing interest in using long reads in microbiome sequencing, and there is a need to adapt short read tools to long read datasets. We describe a new LCA-based algorithm for taxonomic binning, and an interval-tree based algorithm for functional binning, that are explicitly designed for long reads and assembled contigs. We provide a new interactive tool for investigating the alignment of long reads against reference sequences. For taxonomic and functional binning, we propose to use LAST to compare long reads against the NCBI-nr protein reference database so as to obtain frame-shift aware alignments, and then to process the results using our new methods. All presented methods are implemented in the open source edition of MEGAN, and we refer to this new extension as MEGAN-LR (MEGAN long read). We evaluate the LAST+MEGAN-LR approach in a simulation study, and on a number of mock community datasets consisting of Nanopore reads, PacBio reads and assembled PacBio reads. We also illustrate the practical application on a Nanopore dataset that we sequenced from an anammox bio-reactor community. This article was reviewed by Nicola Segata together with Moreno Zolfo, Pete James Lockhart and Serghei Mangul. This work extends the applicability of the widely-used metagenomic analysis software MEGAN to long reads. Our study suggests that the presented LAST+MEGAN-LR pipeline is sufficiently fast and accurate.
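For readers unfamiliar with LCA-based binning, the general idea can be sketched as follows. This is only the generic "naive" lowest-common-ancestor rule, not the long-read algorithm the abstract describes (whose details are not given here). The toy taxonomy, the `parent` map, and the function names are all hypothetical.

```python
def lineage(taxon, parent):
    """Return the path from a taxon up to the root of the taxonomy."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def naive_lca(taxa, parent):
    """Place a read on the lowest taxon shared by every taxon it aligns to."""
    if not taxa:
        raise ValueError("need at least one taxon")
    # Reverse each lineage so that it runs root-first.
    paths = [lineage(t, parent)[::-1] for t in taxa]
    lca = None
    for ranks in zip(*paths):
        if all(r == ranks[0] for r in ranks):
            lca = ranks[0]  # lineages still agree at this depth
        else:
            break  # first disagreement: stop descending
    return lca


# Hypothetical toy taxonomy (child -> parent); "root" has no parent.
parent = {
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Escherichia": "Proteobacteria",
    "Salmonella": "Proteobacteria",
    "Firmicutes": "Bacteria",
    "Bacillus": "Firmicutes",
}

print(naive_lca(["Escherichia", "Salmonella"], parent))
print(naive_lca(["Escherichia", "Bacillus"], parent))
```

A read whose alignments hit only closely related taxa is binned at a specific node, whereas conflicting hits push it toward the root. A long read may carry several genes with different taxonomic signals, which is presumably why a rule of this kind needs adapting for long reads.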


September 22, 2019  |  

Interpreting microbial biosynthesis in the genomic age: Biological and practical considerations.

Genome mining has become an increasingly powerful, scalable, and economically accessible tool for the study of natural product biosynthesis and drug discovery. However, there remain important biological and practical problems that can complicate or obscure biosynthetic analysis in genomic and metagenomic sequencing projects. Here, we focus on limitations of available technology as well as computational and experimental strategies to overcome them. We review the unique challenges and approaches in the study of symbiotic and uncultured systems, as well as those associated with biosynthetic gene cluster (BGC) assembly and product prediction. Finally, to explore sequencing parameters that affect the recovery and contiguity of large and repetitive BGCs assembled de novo, we simulate Illumina and PacBio sequencing of the Salinispora tropica genome focusing on assembly of the salinilactam (slm) BGC.


September 22, 2019  |  

Capturing single cell genomes of active polysaccharide degraders: an unexpected contribution of Verrucomicrobia.

Microbial hydrolysis of polysaccharides is critical to ecosystem functioning and is of great interest in diverse biotechnological applications, such as biofuel production and bioremediation. Here we demonstrate the use of a new, efficient approach to recover genomes of active polysaccharide degraders from natural, complex microbial assemblages, using a combination of fluorescently labeled substrates, fluorescence-activated cell sorting, and single cell genomics. We employed this approach to analyze freshwater and coastal bacterioplankton for degraders of laminarin and xylan, two of the most abundant storage and structural polysaccharides in nature. Our results suggest that a few phylotypes of Verrucomicrobia make a considerable contribution to polysaccharide degradation, although they constituted only a minor fraction of the total microbial community. Genomic sequencing of five cells, representing the most predominant, polysaccharide-active Verrucomicrobia phylotype, revealed significant enrichment in genes encoding a wide spectrum of glycoside hydrolases, sulfatases, peptidases, carbohydrate lyases and esterases, confirming that these organisms were well equipped for the hydrolysis of diverse polysaccharides. Remarkably, this enrichment was on average higher than in the sequenced representatives of Bacteroidetes, which are frequently regarded as highly efficient biopolymer degraders. These findings shed light on the ecological roles of uncultured Verrucomicrobia and suggest specific taxa as promising bioprospecting targets. The employed method offers a powerful tool to rapidly identify and recover discrete genomes of active players in polysaccharide degradation, without the need for cultivation.


September 22, 2019  |  

Nearly finished genomes produced using gel microdroplet culturing reveal substantial intraspecies genomic diversity within the human microbiome.

The majority of microbial genomic diversity remains unexplored. This is largely due to our inability to culture most microorganisms in isolation, which is a prerequisite for traditional genome sequencing. Single-cell sequencing has allowed researchers to circumvent this limitation. DNA is amplified directly from a single cell using the whole-genome amplification technique of multiple displacement amplification (MDA). However, MDA from a single chromosome copy suffers from amplification bias and a large loss of specificity from even very small amounts of DNA contamination, which makes assembling a genome difficult and completely finishing a genome impossible except in extraordinary circumstances. Gel microdrop cultivation allows culturing of a diverse microbial community and provides hundreds to thousands of genetically identical cells as input for an MDA reaction. We demonstrate the utility of this approach by comparing sequencing results of gel microdroplets and single cells following MDA. Bias is reduced in the MDA reaction and genome sequencing, and assembly is greatly improved when using gel microdroplets. We acquired multiple near-complete genomes for two bacterial species from human oral and stool microbiome samples. A significant amount of genome diversity, including single nucleotide polymorphisms and genome recombination, is discovered. Gel microdroplets offer a powerful and high-throughput technology for assembling whole genomes from complex samples and for probing the pan-genome of naturally occurring populations.


September 22, 2019  |  

The Santa Pola saltern as a model for studying the microbiota of hypersaline environments.

Multi-pond salterns constitute an excellent model for the study of the microbial diversity and ecology of hypersaline environments, showing a wide range of salt concentrations, from seawater to salt saturation. Accumulated studies on the Santa Pola (Alicante, Spain) multi-pond solar saltern during the last 35 years include culture-dependent and culture-independent molecular methods and metagenomics more recently. These approaches have permitted to determine in depth the microbial diversity of the ponds with intermediate salinities (from 10 % salts) up to salt saturation, with haloarchaea and bacteria as the two main dominant groups. In this review, we describe the main results obtained using the different methodologies, the most relevant contributions for understanding the ecology of these extreme environments and the future perspectives for such studies.


September 22, 2019  |  

A high-resolution genetic map of the cereal crown rot pathogen Fusarium pseudograminearum provides a near-complete genome assembly.

Fusarium pseudograminearum is an important pathogen of wheat and barley, particularly in semi-arid environments. Previous genome assemblies for this organism were based entirely on short read data and are highly fragmented. In this work, a genetic map of F. pseudograminearum has been constructed for the first time based on a mapping population of 178 individuals. The genetic map, together with long read scaffolding of a short read-based genome assembly, was used to give a near-complete assembly of the four F. pseudograminearum chromosomes. Large regions of synteny between F. pseudograminearum and F. graminearum, the related pathogen that is the primary causal agent of cereal head blight disease, were previously proposed in the core conserved genome, but the construction of a genetic map to order and orient contigs is critical to the validation of synteny and the placing of species-specific regions. Indeed, our comparative analyses of the genomes of these two related pathogens suggest that rearrangements in the F. pseudograminearum genome have occurred in the chromosome ends. One of these rearrangements includes the transposition of an entire gene cluster involved in the detoxification of the benzoxazolinone (BOA) class of plant phytoalexins. This work provides an important genomic and genetic resource for F. pseudograminearum, which is less well characterized than F. graminearum. In addition, this study provides new insights into a better understanding of the sexual reproduction process in F. pseudograminearum, which informs us of the potential of this pathogen to evolve. © 2016 BSPP and John Wiley & Sons Ltd.


September 22, 2019  |  

Complete genome sequence and analysis of the industrial Saccharomyces cerevisiae strain N85 used in Chinese rice wine production.

Chinese rice wine is a popular traditional alcoholic beverage in China, while its brewing processes have rarely been explored. We herein report the first gapless, near-finished genome sequence of the yeast strain Saccharomyces cerevisiae N85 for Chinese rice wine production. Several assembly methods were used to integrate Pacific Biosciences (PacBio) and Illumina sequencing data to achieve high-quality genome sequencing of the strain. The genome encodes more than 6,000 predicted proteins, and 238 long non-coding RNAs, which are validated by RNA-sequencing data. Moreover, our annotation predicts 171 novel genes that are not present in the reference S288c genome. We also identified 65,902 single nucleotide polymorphisms and small indels, many of which are located within genic regions. Dozens of larger copy-number variations and translocations were detected, mainly enriched in the subtelomeres, suggesting these regions may be related to genomic evolution. This study will serve as a milestone in the study of Chinese rice wine and related beverages in China and in other countries. It will help to develop more scientific and modern fermentation processes of Chinese rice wine, and explore metabolism pathways of desired and harmful components in Chinese rice wine to improve its taste and nutritional value. © The Author(s) 2018. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.


September 22, 2019  |  

Characterisation of pathogen-specific regions and novel effector candidates in Fusarium oxysporum f. sp. cepae.

A reference-quality assembly of Fusarium oxysporum f. sp. cepae (Foc), the causative agent of onion basal rot, has been generated along with genomes of additional pathogenic and non-pathogenic isolates of onion. Phylogenetic analysis confirmed a single origin of the Foc pathogenic lineage. Genome alignments with other F. oxysporum ff. spp. and non pathogens revealed high levels of syntenic conservation of core chromosomes but little synteny between lineage specific (LS) chromosomes. Four LS contigs in Foc totaling 3.9 Mb were designated as pathogen-specific (PS). A two-fold increase in segmental duplication events was observed between LS regions of the genome compared to within core regions or from LS regions to the core. RNA-seq expression studies identified candidate effectors expressed in planta, consisting of both known effector homologs and novel candidates. FTF1 and a subset of other transcription factors implicated in regulation of effector expression were found to be expressed in planta.


September 22, 2019  |  

Convergent evolution of complex genomic rearrangements in two fungal meiotic drive elements.

Meiotic drive is widespread in nature. The conflict it generates is expected to be an important motor for evolutionary change and innovation. In this study, we investigated the genomic consequences of two large multi-gene meiotic drive elements, Sk-2 and Sk-3, found in the filamentous ascomycete Neurospora intermedia. Using long-read sequencing, we generated the first complete and well-annotated genome assemblies of large, highly diverged, non-recombining regions associated with meiotic drive elements. Phylogenetic analysis shows that, even though Sk-2 and Sk-3 are located in the same chromosomal region, they do not form sister clades, suggesting independent origins or at least a long evolutionary separation. We conclude that they have in a convergent manner accumulated similar patterns of tandem inversions and dense repeat clusters, presumably in response to similar needs to create linkage between genes causing drive and resistance.


September 22, 2019  |  

Genomic insights into multidrug-resistance, mating and virulence in Candida auris and related emerging species.

Candida auris is an emergent multidrug-resistant fungal pathogen causing increasing reports of outbreaks. While distantly related to C. albicans and C. glabrata, C. auris is closely related to rarely observed and often multidrug-resistant species from the C. haemulonii clade. Here, we analyze near complete genome assemblies for the four C. auris clades and three related species, and map intra- and inter-species rearrangements across the seven chromosomes. Using RNA-Seq-guided gene predictions, we find that most mating and meiosis genes are conserved and that clades contain either the MTLa or MTLα mating loci. Comparing the genomes of these emerging species to those of other Candida species identifies genes linked to drug resistance and virulence, including expanded families of transporters and lipases, as well as mutations and copy number variants in ERG11. Gene expression analysis identifies transporters and metabolic regulators specific to C. auris and those conserved with related species which may contribute to differences in drug response in this emerging fungal clade.


July 19, 2019  |  

Single molecule sequencing and genome assembly of a clinical specimen of Loa loa, the causative agent of loiasis.

More than 20% of the world’s population is at risk for infection by filarial nematodes and >180 million people worldwide are already infected. Along with infection comes significant morbidity that has a socioeconomic impact. The eight filarial nematodes that infect humans are Wuchereria bancrofti, Brugia malayi, Brugia timori, Onchocerca volvulus, Loa loa, Mansonella perstans, Mansonella streptocerca, and Mansonella ozzardi, of which three have published draft genome sequences. Since all have humans as the definitive host, standard avenues of research that rely on culturing and genetics have often not been possible. Therefore, genome sequencing provides an important window into understanding the biology of these parasites. The need for large amounts of high quality genomic DNA from homozygous, inbred lines; the availability of only short sequence reads from next-generation sequencing platforms at a reasonable expense; and the lack of random large insert libraries has limited our ability to generate high quality genome sequences for these parasites. However, the Pacific Biosciences single molecule, real-time sequencing platform holds great promise in reducing input amounts and generating sufficiently long sequences that bypass the need for large insert paired libraries. Here, we report on efforts to generate a more complete genome assembly for L. loa using genetically heterogeneous DNA isolated from a single clinical sample and sequenced on the Pacific Biosciences platform. To obtain the best assembly, numerous assemblers and sequencing datasets were analyzed, combined, and compared. Quiver-informed trimming of an assembly of only Pacific Biosciences reads by HGAP2 was selected as the final assembly of 96.4 Mbp in 2,250 contigs. This results in ~9% more of the genome in ~85% fewer contigs from ~80% less starting material at a fraction of the cost of previous Roche 454-based sequencing efforts. The result is the most complete filarial nematode assembly produced thus far and demonstrates the utility of single molecule sequencing on the Pacific Biosciences platform for genetically heterogeneous metazoan genomes.


July 19, 2019  |  

Recently published Streptomyces genome sequences.

Many readers of this journal will need no introduction to the bacterial genus Streptomyces, which includes several hundred species, many of which produce biotechnologically useful secondary metabolites. The last 2 years have seen numerous publications describing Streptomyces genome sequences (Table 1), mostly as short genome announcements restricted to just 500 words and therefore allowing little description and analysis. Our aim in this current manuscript is to survey these recent publications and to dig a little deeper where appropriate. The genus Streptomyces is now one of the most highly sequenced, with 19 finished genomic sequences (Table 2) and a further 125 draft assemblies available in the GenBank database as of 3rd of May 2014; by the time this is published, no doubt there will be more. The reasons given for sequencing this latest crop of Streptomyces include production of industrially important enzymes, degradation of lignin, antibiotic production, rapid growth and halo-tolerance and an endophytic lifestyle (Table 1).


July 19, 2019  |  

Chaos of rearrangements in the mating-type chromosomes of the anther-smut fungus Microbotryum lychnidis-dioicae.

Sex chromosomes in plants and animals and fungal mating-type chromosomes often show exceptional genome features, with extensive suppression of homologous recombination and cytological differentiation between members of the diploid chromosome pair. Despite strong interest in the genetics of these chromosomes, their large regions of suppressed recombination often are enriched in transposable elements and therefore can be challenging to assemble. Here we show that the latest improvements of the PacBio sequencing yield assembly of the whole genome of the anther-smut fungus, Microbotryum lychnidis-dioicae (the pathogenic fungus causing anther-smut disease of Silene latifolia), into finished chromosomes or chromosome arms, even for the repeat-rich mating-type chromosomes and centromeres. Suppressed recombination of the mating-type chromosomes is revealed to span nearly 90% of their lengths, with extreme levels of rearrangements, transposable element accumulation, and differentiation between the two mating types. We observed no correlation between allelic divergence and physical position in the nonrecombining regions of the mating-type chromosomes. This may result from gene conversion or from rearrangements of ancient evolutionary strata, i.e., successive steps of suppressed recombination. Centromeres were found to be composed mainly of copia-like transposable elements and to possess specific minisatellite repeats identical between the different chromosomes. We also identified subtelomeric motifs. In addition, extensive signs of degeneration were detected in the nonrecombining regions in the form of transposable element accumulation and of hundreds of gene losses on each mating-type chromosome. Furthermore, our study highlights the potential of the latest breakthrough PacBio chemistry to resolve complex genome architectures. Copyright © 2015 by the Genetics Society of America.


July 19, 2019  |  

Single-molecule sequencing of the desiccation-tolerant grass Oropetium thomaeum.

Plant genomes, and eukaryotic genomes in general, are typically repetitive, polyploid and heterozygous, which complicates genome assembly. The short read lengths of early Sanger and current next-generation sequencing platforms hinder assembly through complex repeat regions, and many draft and reference genomes are fragmented, lacking skewed GC and repetitive intergenic sequences, which are gaining importance due to projects like the Encyclopedia of DNA Elements (ENCODE). Here we report the whole-genome sequencing and assembly of the desiccation-tolerant grass Oropetium thomaeum. Using only single-molecule real-time sequencing, which generates long (>16 kilobases) reads with random errors, we assembled 99% (244 megabases) of the Oropetium genome into 625 contigs with an N50 length of 2.4 megabases. Oropetium is an example of a ‘near-complete’ draft genome which includes gapless coverage over gene space as well as intergenic sequences such as centromeres, telomeres, transposable elements and rRNA clusters that are typically unassembled in draft genomes. Oropetium has 28,466 protein-coding genes and 43% repeat sequences, yet with 30% more compact euchromatic regions it is the smallest known grass genome. The Oropetium genome demonstrates the utility of single-molecule real-time sequencing for assembling high-quality plant and other eukaryotic genomes, and serves as a valuable resource for the plant comparative genomics community.
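The N50 contig length quoted above is a standard contiguity metric: the length L such that contigs of length L or longer together cover at least half of the assembly. A minimal sketch of the calculation follows. The contig lengths are made-up illustrative values, not data from the Oropetium assembly, and the function name is hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the summed
    assembly length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Illustrative lengths only (arbitrary units); the total is 300.
lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(lengths))  # 80 + 70 = 150 reaches half of 300, so this prints 70
```

Because a few long contigs can dominate the total, N50 is usually read alongside the contig count and the fraction of the genome assembled, as the abstract reports.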

