July 19, 2019

SMRT genome assembly corrects reference errors, resolving the genetic basis of virulence in Mycobacterium tuberculosis.

The genetic basis of virulence in Mycobacterium tuberculosis has been investigated through genome comparisons of virulent (H37Rv) and attenuated (H37Ra) sister strains. Such analysis, however, relies heavily on the accuracy of the sequences. While the H37Rv reference genome has had several corrections to date, that of H37Ra is unmodified since its original publication. Here, we report the assembly and finishing of the H37Ra genome from single-molecule, real-time (SMRT) sequencing. Our assembly reveals that the number of H37Ra-specific variants is less than half of what the Sanger-based H37Ra reference sequence indicates, undermining and, in some cases, invalidating the conclusions of several studies. PE_PPE family genes, which are intractable to commonly used sequencing platforms because of their repetitive and GC-rich nature, are overrepresented in the set of genes in which all reported H37Ra-specific variants are contradicted. Further, one of the sequencing errors in H37Ra masks a true variant in common with the clinical strain CDC1551 which, when considered in the context of previous work, corresponds to a sequencing error in the H37Rv reference genome. Our results constrain the set of genomic differences possibly affecting virulence by more than half, which focuses laboratory investigation on pertinent targets and demonstrates the power of SMRT sequencing for producing high-quality reference genomes.


July 19, 2019

Comparative genomics of two sequential Candida glabrata clinical isolates.

Candida glabrata is an important fungal pathogen which develops rapid antifungal resistance in treated patients. It is known that azole treatments lead to antifungal resistance in this fungal species and that multidrug efflux transporters are involved in this process. Specific mutations in the transcriptional regulator PDR1 result in upregulation of the transporters. In addition, we showed that the PDR1 mutations can contribute to enhanced virulence in animal models. In this study, we compared the genomes of two specific C. glabrata-related isolates, one of which was azole susceptible (DSY562) while the other was azole resistant (DSY565). DSY565 contained a PDR1 mutation (L280F) and was isolated after a time-lapse of 50 d of azole therapy. We expected that genome comparisons between both isolates could reveal additional mutations reflecting host adaptation or even additional resistance mechanisms. The PacBio technology used here yielded 14 major contigs (sizes 0.18-1.6 Mb) and mitochondrial genomes from both DSY562 and DSY565 isolates that were highly similar to each other. Comparisons of the clinical genomes with the published CBS138 genome indicated important genome rearrangements, but not between the clinical strains. Among the unique features, several retrotransposons were identified in the genomes of the investigated clinical isolates. DSY562 and DSY565 each contained a large set of adhesin-like genes (101 and 107, respectively), which exceed by far the number of reported adhesins (63) in the CBS138 genome. Comparison between DSY562 and DSY565 yielded 17 nonsynonymous SNPs (among which was the expected PDR1 mutation) as well as small indels in coding regions (11), mainly in adhesin-like genes. The genomes contained a DNA mismatch repair allele of MSH2 known to be involved in the so-called hyper-mutator phenotype of this yeast species, and the number of accumulated mutations between both clinical isolates is consistent with the presence of an MSH2 defect. In conclusion, this study is the first to compare genomes of C. glabrata sequential clinical isolates using the PacBio technology as an approach. The genomes of these isolates, taken from the same patient at two different time points, exhibited limited variations, even though they were subjected to host pressure. Copyright © 2017 Vale-Silva et al.


July 19, 2019

Reduction in chromosome mobility accompanies nuclear organization during early embryogenesis in Caenorhabditis elegans.

In differentiated cells, chromosomes are packed inside the cell nucleus in an organised fashion. In contrast, little is known about how chromosomes are packed in undifferentiated cells and how nuclear organization changes during development. To assess changes in nuclear organization during the earliest stages of development, we quantified the mobility of a pair of homologous chromosomal loci in the interphase nuclei of Caenorhabditis elegans embryos. The distribution of distances between homologous loci was consistent with a random distribution up to the 8-cell stage but not at later stages. The mobility of the loci was significantly reduced from the 2-cell to the 48-cell stage. Nuclear foci corresponding to epigenetic marks as well as heterochromatin and the nucleolus also appeared around the 8-cell stage. We propose that the earliest global transformation in nuclear organization occurs at the 8-cell stage during C. elegans embryogenesis.


July 19, 2019

Iterative optimization of xylose catabolism in Saccharomyces cerevisiae using combinatorial expression tuning.

A common challenge in metabolic engineering is rapidly identifying rate-controlling enzymes in heterologous pathways for subsequent production improvement. We demonstrate a workflow to address this challenge and apply it to improving xylose utilization in Saccharomyces cerevisiae. For eight reactions required for conversion of xylose to ethanol, we screened enzymes for functional expression in S. cerevisiae, followed by a combinatorial expression analysis to achieve pathway flux balancing and identification of limiting enzymatic activities. In the next round of strain engineering, we increased the copy number of these limiting enzymes and again tested the eight-enzyme combinatorial expression library in this new background. This workflow yielded a strain that has a ~70% increase in biomass yield and ~240% increase in xylose utilization. Finally, we chromosomally integrated the expression library. This library enriched for strains with multiple integrations of the pathway, which likely were the result of tandem integrations mediated by promoter homology. Biotechnol. Bioeng. 2017;114: 1301-1309. © 2017 Wiley Periodicals, Inc.


July 19, 2019

Multiple independent changes in mitochondrial genome conformation in chlamydomonadalean algae

Chlamydomonadalean green algae are no stranger to linear mitochondrial genomes, particularly members of the Reinhardtinia clade. At least nine different Reinhardtinia species are known to have linear mitochondrial DNAs (mtDNAs), including the model species Chlamydomonas reinhardtii. Thus, it is no surprise that some have suggested that the most recent common ancestor of the Reinhardtinia clade had a linear mtDNA. But the recent uncovering of circular-mapping mtDNAs in a range of Reinhardtinia algae, such as Volvox carteri and Tetrabaena socialis, has shed doubt on this hypothesis. Here, we explore mtDNA sequence and structure within the colonial Reinhardtinia algae Yamagishiella unicocca and Eudorina sp. NIES-3984, which occupy phylogenetically intermediate positions between species with opposing mtDNA mapping structures. Sequencing and gel electrophoresis data indicate that Y. unicocca has a linear monomeric mitochondrial genome with long (3 kb) palindromic telomeres. Conversely, the mtDNA of Eudorina sp., despite having an identical gene order to that of Y. unicocca, assembled as a circular-mapping molecule. Restriction digests of Eudorina sp. mtDNA supported its circular map, but also revealed a linear monomeric form with a matching architecture and gene order to the Y. unicocca mtDNA. Based on these data, we suggest that there have been at least three separate shifts in mtDNA conformation in the Reinhardtinia, and that the common ancestor of this clade had a linear monomeric mitochondrial genome with palindromic telomeres.


July 19, 2019

The complete genome sequence of the phytopathogenic fungus Sclerotinia sclerotiorum reveals insights into the genome architecture of broad host range pathogens.

Sclerotinia sclerotiorum is a phytopathogenic fungus with over 400 hosts including numerous economically important cultivated species. This contrasts with many economically destructive pathogens that only exhibit a single or very few hosts. Many plant pathogens exhibit a “two-speed” genome, so described because their genomes contain alternating gene-rich, repeat-sparse and gene-poor, repeat-rich regions. In fungi, the repeat-rich regions may be subjected to a process termed repeat-induced point mutation (RIP). Both repeat activity and RIP are thought to play a significant role in evolution of secreted virulence proteins, termed effectors. We present a complete genome sequence of S. sclerotiorum generated using Single Molecule Real-Time Sequencing technology with highly accurate annotations produced using an extensive RNA sequencing data set. We identified 70 effector candidates and have highlighted their in planta expression profiles. Furthermore, we characterized the genome architecture of S. sclerotiorum in comparison to plant pathogens that exhibit “two-speed” genomes. We show that there is a significant association between positions of secreted proteins and regions with a high RIP index in S. sclerotiorum but we did not detect a correlation between secreted protein proportion and GC content. Neither did we detect a negative correlation between CDS content and secreted protein proportion across the S. sclerotiorum genome. We conclude that S. sclerotiorum exhibits subtle signatures of enhanced mutation of secreted proteins in specific genomic compartments as a result of transposition and RIP activity. However, these signatures are not observable at the whole-genome scale.


July 19, 2019

Re-sequencing transgenic plants revealed rearrangements at T-DNA inserts, and integration of a short T-DNA fragment, but no increase of small mutations elsewhere.

Transformation resulted in deletions and translocations at T-DNA inserts, but not in genome-wide small mutations. A tiny T-DNA splinter was detected that probably would remain undetected by conventional techniques. We investigated to what extent Agrobacterium tumefaciens-mediated transformation is mutagenic, on top of inserting T-DNA. To prevent mutations due to in vitro propagation, we applied floral dip transformation of Arabidopsis thaliana. We re-sequenced the genomes of five primary transformants, and compared these to genomic sequences derived from a pool of four wild-type plants. By genome-wide comparisons, we identified ten small mutations in the genomes of the five transgenic plants, not correlated to the positions or number of T-DNA inserts. This mutation frequency is within the range of spontaneous mutations occurring during seed propagation in A. thaliana, as determined earlier. In addition, we detected small as well as large deletions specifically at the T-DNA insert sites. Furthermore, we detected partial T-DNA inserts, one of these a tiny 50-bp fragment originating from a central part of the T-DNA construct used, inserted into the plant genome without other flanking T-DNA. Because of its small size, we named this fragment a T-DNA splinter. As far as we know, this is the first report of such a small T-DNA fragment insert in the absence of any T-DNA border sequence. Finally, we found evidence for translocations from other chromosomes, flanking T-DNA inserts. In this study, we showed that next-generation sequencing (NGS) is a highly sensitive approach to detect T-DNA inserts in transgenic plants.


July 19, 2019

First report of two complete Clostridium chauvoei genome sequences and detailed in silico genome analysis.

Clostridium (C.) chauvoei is a Gram-positive, spore-forming, anaerobic bacterium. It causes black leg in ruminants, a typically fatal histotoxic myonecrosis. High-quality circular genome sequences were generated for the C. chauvoei type strain DSM 7528(T) (ATCC 10092(T)) and a field strain 12S0467 isolated in Germany. The origin of replication (oriC) was comparable to that of Bacillus subtilis in structure with two regions containing DnaA boxes. Similar prophages were identified in the genomes of both C. chauvoei strains which also harbored hemolysin and bacterial spore formation genes. A CRISPR type I-B system with limited variations in the repeat number was identified. Sporulation and germination process related genes were homologous to those of the Clostridia cluster I group but novel variations for regulatory genes were identified, indicative of strain-specific control of regulatory events. Phylogenomics showed a higher relatedness to C. septicum than to other so far sequenced genomes of species belonging to the genus Clostridium. Comparative genome analysis of three C. chauvoei circular genome sequences revealed the presence of few inversions and translocations in locally collinear blocks (LCBs). The species genome also shows a large number of genes involved in proteolysis, genes for glycosyl hydrolases and metal iron transportation genes which are presumably involved in virulence and survival in the host. Three conserved flagellar genes (fliC) were identified in each of the circular genomes. In conclusion, this is the first comparative analysis of circular genomes for the species C. chauvoei, enabling insights into genome composition and virulence factor variation. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.


July 19, 2019

A case study into microbial genome assembly gap sequences and finishing strategies.

This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as a cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly, which included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.


July 19, 2019

PacBio but not Illumina technology can achieve fast, accurate and complete closure of the high GC, complex Burkholderia pseudomallei two-chromosome genome

Although PacBio third-generation sequencers have improved the read lengths of genome sequencing which facilitates the assembly of complete genomes, no study has reported success in using PacBio data alone to completely sequence a two-chromosome bacterial genome from a single library in a single run. Previous studies using earlier versions of sequencing chemistries have at most been able to finish bacterial genomes containing only one chromosome with de novo assembly. In this study, we compared the robustness of PacBio RS II, using one SMRT cell and the latest P6-C4 chemistry, with Illumina HiSeq 1500 in sequencing the genome of Burkholderia pseudomallei, a bacterium which contains two large circular chromosomes, very high G+C content of 68–69%, highly repetitive regions and substantial genomic diversity, and represents one of the largest and most complex bacterial genomes sequenced, using a reference genome generated by hybrid assembly using PacBio and Illumina datasets with subsequent manual validation. Results showed that PacBio data with de novo assembly, but not Illumina, was able to completely sequence the B. pseudomallei genome without any gaps or mis-assemblies. The two large contigs of the PacBio assembly aligned unambiguously to the reference genome, sharing >99.9% nucleotide identities. Conversely, Illumina data assembled using three different assemblers resulted in fragmented assemblies (201–366 contigs), sharing only 92.2–100% and 92.0–100% nucleotide identities to chromosomes I and II reference sequences, respectively, with no indication that the B. pseudomallei genome consisted of two chromosomes with four copies of ribosomal operons. Among all assemblies, the PacBio assembly recovered the highest number of core and virulence proteins, and housekeeping genes based on whole-genome multilocus sequence typing (wgMLST). Most notably, assembly solely based on PacBio outperformed even hybrid assembly using both PacBio and Illumina datasets. 
The hybrid approach generated only 74 contigs, while the PacBio data alone with de novo assembly achieved complete closure of the two-chromosome B. pseudomallei genome without additional costly bench work and further sequencing. PacBio RS II using P6-C4 chemistry is highly robust and cost-effective and should be the platform of choice in sequencing bacterial genomes, particularly for those that are well known to be difficult to sequence.


July 19, 2019

A new method for sequencing the hypervariable Plasmodium falciparum gene var2csa from clinical samples.

VAR2CSA, a member of the Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1) family, mediates the binding of P. falciparum-infected erythrocytes to chondroitin sulfate A, a surface-associated molecule expressed in placental cells, and plays a central role in the pathogenesis of placental malaria. VAR2CSA is a target of naturally acquired immunity and, as such, is a leading vaccine candidate against placental malaria. This protein is very polymorphic and technically challenging to sequence. Published var2csa sequences, mostly limited to specific domains, have been generated through the sequencing of cloned PCR amplicons using capillary electrophoresis, a method that is both time consuming and costly, and that performs poorly when applied to clinical samples that are commonly polyclonal. A next-generation sequencing platform, Pacific Biosciences (PacBio), offers an alternative approach to overcome these issues. PCR primers were designed that target a 5 kb segment in the 5′ end of var2csa and the resulting amplicons were sequenced using PacBio sequencing. The primers were optimized using two laboratory strains and were validated on DNA from 43 clinical samples, extracted from dried blood spots on filter paper or from cryopreserved P. falciparum-infected erythrocytes. Sequence reads were assembled using the SMRT-analysis ConsensusTools module. Here, a PacBio sequencing-based approach is described for recovering, from clinical malaria samples, a segment encoding the majority of VAR2CSA’s extracellular region; this segment includes the totality of the first four domains in the 5′ end of var2csa (~5 kb). The feasibility of the method is demonstrated, with a high success rate from cryopreserved samples and more limited success from dried blood spots stored at room temperature, and the genetic variation of the var2csa locus is characterized. This method will facilitate a detailed analysis of var2csa genetic variation and can be adapted to sequence other hypervariable P. falciparum genes.


July 19, 2019

A mobile pathogenicity chromosome in Fusarium oxysporum for infection of multiple cucurbit species.

The genome of Fusarium oxysporum (Fo) consists of a set of eleven ‘core’ chromosomes, shared by most strains and responsible for housekeeping, and one or several accessory chromosomes. We sequenced a strain of Fo f.sp. radicis-cucumerinum (Forc) using PacBio SMRT sequencing. All but one of the core chromosomes were assembled into single contigs, and a chromosome that shows all the hallmarks of a pathogenicity chromosome comprised two contigs. A central part of this chromosome contains all identified candidate effector genes, including homologs of SIX6, SIX9, SIX11 and SIX13. We show that SIX6 contributes to virulence of Forc. Through horizontal chromosome transfer (HCT) to a non-pathogenic strain, we also show that the accessory chromosome containing the SIX gene homologs is indeed a pathogenicity chromosome for cucurbit infection. Conversely, complete loss of virulence was observed in Forc016 strains that lost this chromosome. We conclude that a non-wilt-inducing Fo pathogen also relies on effector proteins for successful infection and that the Forc pathogenicity chromosome contains all the information necessary for causing root rot of cucurbits. Three out of nine HCT strains investigated have undergone large-scale chromosome alterations, reflecting the remarkable plasticity of Fo genomes.


July 19, 2019

Insight into the recent genome duplication of the halophilic yeast Hortaea werneckii: combining an improved genome with gene expression and chromatin structure.

Extremophilic organisms demonstrate the flexibility and adaptability of basic biological processes by highlighting how cell physiology adapts to environmental extremes. Few eukaryotic extremophiles have been well studied and only a small number are amenable to laboratory cultivation and manipulation. A detailed characterization of the genome architecture of such organisms is important to illuminate how they adapt to environmental stresses. One excellent example of a fungal extremophile is the halophile Hortaea werneckii (Pezizomycotina, Dothideomycetes, Capnodiales), a yeast-like fungus able to thrive at near-saturating concentrations of sodium chloride and which is also tolerant to both UV irradiation and desiccation. Given its unique lifestyle and its remarkably recent whole genome duplication, H. werneckii provides opportunities for testing the role of genome duplications and adaptability to extreme environments. We previously assembled the genome of H. werneckii using short-read sequencing technology and found a remarkable degree of gene duplication. Technology limitations, however, precluded high-confidence annotation of the entire genome. We therefore revisited the H. werneckii genome using long-read, single-molecule sequencing and provide an improved genome assembly which, combined with transcriptome and nucleosome analysis, provides a useful resource for fungal halophile genomics. Remarkably, the ~50 Mb H. werneckii genome contains 15,974 genes of which 95% (7608) are duplicates formed by a recent whole genome duplication (WGD), with an average of 5% protein sequence divergence between them. We found that the WGD is extraordinarily recent, and compared to Saccharomyces cerevisiae, the majority of the genome’s ohnologs have not diverged at the level of gene expression or chromatin structure. Copyright © 2017 Sinha et al.


July 19, 2019

The draft genome of Globodera ellingtonae.

Globodera ellingtonae is a newly described potato cyst nematode (PCN) found in Idaho, Oregon, and Argentina. Here, we present a genome assembly for G. ellingtonae, a relative of the quarantine nematodes G. pallida and G. rostochiensis, produced using data from Illumina and Pacific Biosciences DNA sequencing technologies.


July 19, 2019

Amplification-free, CRISPR-Cas9 targeted enrichment and SMRT Sequencing of repeat-expansion disease causative genomic regions

Targeted sequencing has proven to be an economical means of obtaining sequence information for one or more defined regions of a larger genome. However, most target enrichment methods require amplification. Some genomic regions, such as those with extreme GC content and repetitive sequences, are recalcitrant to faithful amplification. Yet, many human genetic disorders are caused by repeat expansions, including difficult-to-sequence tandem repeats. We have developed a novel, amplification-free enrichment technique that employs the CRISPR-Cas9 system for specific targeting of multiple genomic loci. This method, in conjunction with long reads generated through Single Molecule, Real-Time (SMRT) sequencing and unbiased coverage, enables enrichment and sequencing of complex genomic regions that cannot be investigated with other technologies. Using human genomic DNA samples, we demonstrate successful targeting of causative loci for Huntington's disease (HTT; CAG repeat), Fragile X syndrome (FMR1; CGG repeat), amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (C9orf72; GGGGCC repeat), and spinocerebellar ataxia type 10 (SCA10) (ATXN10; variable ATTCT repeat). The method, amenable to multiplexing across multiple genomic loci, uses an amplification-free approach that facilitates the isolation of hundreds of individual on-target molecules in a single SMRT Cell and accurate sequencing through long repeat stretches, regardless of extreme GC percent or sequence complexity content. Our novel targeted sequencing method opens new doors to genomic analyses independent of PCR amplification that will facilitate the study of repeat expansion disorders.

