Menu
July 19, 2019

A new method for sequencing the hypervariable Plasmodium falciparum gene var2csa from clinical samples.

VAR2CSA, a member of the Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1) family, mediates the binding of P. falciparum-infected erythrocytes to chondroitin sulfate A, a surface-associated molecule expressed in placental cells, and plays a central role in the pathogenesis of placental malaria. VAR2CSA is a target of naturally acquired immunity and, as such, is a leading vaccine candidate against placental malaria. This protein is very polymorphic and technically challenging to sequence. Published var2csa sequences, mostly limited to specific domains, have been generated through the sequencing of cloned PCR amplicons using capillary electrophoresis, a method that is both time consuming and costly, and that performs poorly when applied to clinical samples that are commonly polyclonal. A next-generation sequencing platform, Pacific Biosciences (PacBio), offers an alternative approach to overcome these issues.PCR primers were designed that target a 5 kb segment in the 5′ end of var2csa and the resulting amplicons were sequenced using PacBio sequencing. The primers were optimized using two laboratory strains and were validated on DNA from 43 clinical samples, extracted from dried blood spots on filter paper or from cryopreserved P. falciparum-infected erythrocytes. Sequence reads were assembled using the SMRT-analysis ConsensusTools module.Here, a PacBio sequencing-based approach for recovering a segment encoding the majority of VAR2CSA’s extracellular region is described; this segment includes the totality of the first four domains in the 5′ end of var2csa (~5 kb), from clinical malaria samples. The feasibility of the method is demonstrated, showing a high success rate from cryopreserved samples and more limited success from dried blood spots stored at room temperature, and characterized the genetic variation of the var2csa locus.This method will facilitate a detailed analysis of var2csa genetic variation and can be adapted to sequence other hypervariable P. falciparum genes.


July 19, 2019

A mobile pathogenicity chromosome in Fusarium oxysporum for infection of multiple cucurbit species.

The genome of Fusarium oxysporum (Fo) consists of a set of eleven ‘core’ chromosomes, shared by most strains and responsible for housekeeping, and one or several accessory chromosomes. We sequenced a strain of Fo f.sp. radicis-cucumerinum (Forc) using PacBio SMRT sequencing. All but one of the core chromosomes were assembled into single contigs, and a chromosome that shows all the hallmarks of a pathogenicity chromosome comprised two contigs. A central part of this chromosome contains all identified candidate effector genes, including homologs of SIX6, SIX9, SIX11 and SIX 13. We show that SIX6 contributes to virulence of Forc. Through horizontal chromosome transfer (HCT) to a non-pathogenic strain, we also show that the accessory chromosome containing the SIX gene homologs is indeed a pathogenicity chromosome for cucurbit infection. Conversely, complete loss of virulence was observed in Forc016 strains that lost this chromosome. We conclude that also a non-wilt-inducing Fo pathogen relies on effector proteins for successful infection and that the Forc pathogenicity chromosome contains all the information necessary for causing root rot of cucurbits. Three out of nine HCT strains investigated have undergone large-scale chromosome alterations, reflecting the remarkable plasticity of Fo genomes.


July 19, 2019

The draft genome of tropical fruit durian (Durio zibethinus).

Durian (Durio zibethinus) is a Southeast Asian tropical plant known for its hefty, spine-covered fruit and sulfury and onion-like odor. Here we present a draft genome assembly of D. zibethinus, representing the third plant genus in the Malvales order and first in the Helicteroideae subfamily to be sequenced. Single-molecule sequencing and chromosome contact maps enabled assembly of the highly heterozygous durian genome at chromosome-scale resolution. Transcriptomic analysis showed upregulation of sulfur-, ethylene-, and lipid-related pathways in durian fruits. We observed paleopolyploidization events shared by durian and cotton and durian-specific gene expansions in MGL (methionine ?-lyase), associated with production of volatile sulfur compounds (VSCs). MGL and the ethylene-related gene ACS (aminocyclopropane-1-carboxylic acid synthase) were upregulated in fruits concomitantly with their downstream metabolites (VSCs and ethylene), suggesting a potential association between ethylene biosynthesis and methionine regeneration via the Yang cycle. The durian genome provides a resource for tropical fruit biology and agronomy.


July 19, 2019

The composite 259-kb plasmid of Martelella mediterranea DSM 17316(T)-a natural replicon with functional RepABC modules from Rhodobacteraceae and Rhizobiaceae.

A multipartite genome organization with a chromosome and many extrachromosomal replicons (ECRs) is characteristic for Alphaproteobacteria. The best investigated ECRs of terrestrial rhizobia are the symbiotic plasmids for legume root nodulation and the tumor-inducing (Ti) plasmid of Agrobacterium tumefaciens. RepABC plasmids represent the most abundant alphaproteobacterial replicon type. The currently known homologous replication modules of rhizobia and Rhodobacteraceae are phylogenetically distinct. In this study, we surveyed type-strain genomes from the One Thousand Microbial Genomes (KMG-I) project and identified a roseobacter-specific RepABC-type operon in the draft genome of the marine rhizobium Martelella mediterranea DSM 17316(T). PacBio genome sequencing demonstrated the presence of three circular ECRs with sizes of 593, 259, and 170-kb. The rhodobacteral RepABC module is located together with a rhizobial equivalent on the intermediate sized plasmid pMM259, which likely originated in the fusion of a pre-existing rhizobial ECR with a conjugated roseobacter plasmid. Further evidence for horizontal gene transfer (HGT) is given by the presence of a roseobacter-specific type IV secretion system on the 259-kb plasmid and the rhodobacteracean origin of 62% of the genes on this plasmid. Functionality tests documented that the genuine rhizobial RepABC module from the Martelella 259-kb plasmid is only maintained in A. tumefaciens C58 (Rhizobiaceae) but not in Phaeobacter inhibens DSM 17395 (Rhodobacteraceae). Unexpectedly, the roseobacter-like replication system is functional and stably maintained in both host strains, thus providing evidence for a broader host range than previously proposed. In conclusion, pMM259 is the first example of a natural plasmid that likely mediates genetic exchange between roseobacters and rhizobia.


July 19, 2019

De novo PacBio long-read and phased avian genome assemblies correct and add to reference genes generated with intermediate and short reads.

Reference-quality genomes are expected to provide a resource for studying gene structure, function, and evolution. However, often genes of interest are not completely or accurately assembled, leading to unknown errors in analyses or additional cloning efforts for the correct sequences. A promising solution is long-read sequencing. Here we tested PacBio-based long-read sequencing and diploid assembly for potential improvements to the Sanger-based intermediate-read zebra finch reference and Illumina-based short-read Anna’s hummingbird reference, 2 vocal learning avian species widely studied in neuroscience and genomics. With DNA of the same individuals used to generate the reference genomes, we generated diploid assemblies with the FALCON-Unzip assembler, resulting in contigs with no gaps in the megabase range, representing 150-fold and 200-fold improvements over the current zebra finch and hummingbird references, respectively. These long-read and phased assemblies corrected and resolved what we discovered to be numerous misassemblies in the references, including missing sequences in gaps, erroneous sequences flanking gaps, base call errors in difficult-to-sequence regions, complex repeat structure errors, and allelic differences between the 2 haplotypes. These improvements were validated by single long-genome and transcriptome reads and resulted for the first time in completely resolved protein-coding genes widely studied in neuroscience and specialized in vocal learning species. These findings demonstrate the impact of long reads, sequencing of previously difficult-to-sequence regions, and phasing of haplotypes on generating the high-quality assemblies necessary for understanding gene structure, function, and evolution.© The Authors 2017. Published by Oxford University Press.


July 19, 2019

The diversity, structure, and function of heritable adaptive immunity sequences in the Aedes aegypti genome.

The Aedes aegypti mosquito transmits arboviruses, including dengue, chikungunya, and Zika virus. Understanding the mechanisms underlying mosquito immunity could provide new tools to control arbovirus spread. Insects exploit two different RNAi pathways to combat viral and transposon infection: short interfering RNAs (siRNAs) and PIWI-interacting RNAs (piRNAs) [1, 2]. Endogenous viral elements (EVEs) are sequences from non-retroviral viruses that are inserted into the mosquito genome and can act as templates for the production of piRNAs [3, 4]. EVEs therefore represent a record of past infections and a reservoir of potential immune memory [5]. The large-scale organization of EVEs has been difficult to resolve with short-read sequencing because they tend to integrate into repetitive regions of the genome. To define the diversity, organization, and function of EVEs, we took advantage of the contiguity associated with long-read sequencing to generate a high-quality assembly of the Ae. aegypti-derived Aag2 cell line genome, an important and widely used model system. We show EVEs are acquired through recombination with specific classes of long terminal repeat (LTR) retrotransposons and organize into large loci (>50 kbp) characterized by high LTR density. These EVE-containing loci have increased density of piRNAs compared to similar regions without EVEs. Furthermore, we detected EVE-derived piRNAs consistent with a targeted processing of persistently infecting virus genomes. We propose that comparisons of EVEs across mosquito populations may explain differences in vector competence, and further study of the structure and function of these elements in the genome of mosquitoes may lead to epidemiological interventions. Copyright © 2017 Elsevier Ltd. All rights reserved.


July 19, 2019

Analysis of recombinational switching at the antigenic variation locus of the Lyme spirochete using a novel PacBio sequencing pipeline.

The Lyme disease spirochete evades the host immune system by combinatorial variation of VlsE, a surface antigen. Antigenic variation occurs via segmental gene conversion from contiguous silent cassettes into the vlsE locus. Because of the high degree of similarity between switch variants and the size of vlsE, short-read NGS technologies have been unsuitable for sequencing vlsE populations. Here we use PacBio sequencing technology coupled with the first fully-automated software pipeline (VAST) to accurately process NGS data by minimizing error frequency, eliminating heteroduplex errors and accurately aligning switch variants. We extend earlier studies by showing use of almost all of the vlsE SNP repertoire. In different tissues of the same mouse, 99.6% of the variants were unique, suggesting that dissemination of Borrelia burgdorferi is predominantly unidirectional with little tissue-to-tissue hematogenous dissemination. We also observed a similar number of variants in SCID and wild-type mice, a heatmap of location and frequency of amino acid changes on the 3D structure and note differences observed in SCID versus wild type mice that hint at possible amino acid function. Our observed selection against diversification of residues at the dimer interface in wild-type mice strongly suggests that dimerization is required for in vivo functionality of vlsE.© 2017 John Wiley & Sons Ltd.


July 19, 2019

Long-read genome sequence assembly provides insight into ongoing retroviral invasion of the koala germline.

The koala retrovirus (KoRV) is implicated in several diseases affecting the koala (Phascolarctos cinereus). KoRV provirus can be present in the genome of koalas as an endogenous retrovirus (present in all cells via germline integration) or as exogenous retrovirus responsible for somatic integrations of proviral KoRV (present in a limited number of cells). This ongoing invasion of the koala germline by KoRV provides a powerful opportunity to assess the viral strategies used by KoRV in an individual. Analysis of a high-quality genome sequence of a single koala revealed 133 KoRV integration sites. Most integrations contain full-length, endogenous provirus; KoRV-A subtype. The second most frequent integrations contain an endogenous recombinant element (recKoRV) in which most of the KoRV protein-coding region has been replaced with an ancient, endogenous retroelement. A third set of integrations, with very low sequence coverage, may represent somatic cell integrations of KoRV-A, KoRV-B and two recently designated additional subgroups, KoRV-D and KoRV-E. KoRV-D and KoRV-E are missing several genes required for viral processing, suggesting they have been transmitted as defective viruses. Our results represent the first comprehensive analyses of KoRV integration and variation in a single animal and provide further insights into the process of retroviral-host species interactions.


July 19, 2019

Comparative genomic analyses of Clavibacter michiganensis subsp. insidiosus and pathogenicity on Medicago truncatula.

Clavibacter michiganensis is the most economically important gram-positive bacterial plant pathogen with subspecies that cause serious diseases of maize, wheat, tomato, potato, and alfalfa. Much less is known about pathogenesis involving gram-positive plant pathogens than is known for gram-negative bacteria. Comparative genome analyses of C. michiganensis subspecies affecting tomato, potato, and maize have provided insights on pathogenicity. In this study, we identified strains of C. michiganensis subsp. insidiosus with contrasting pathogenicity on three accessions of the model legume Medicago truncatula. We generated complete genome sequences for two strains and compared these to a previously sequenced strain and genome sequences of four other subspecies. The three C. michiganensis subsp. insidiosus strains varied in gene content due to genome rearrangements, most likely facilitated by insertion elements, and plasmid number, which varied from one to three depending on strain. The core C. michiganensis genome consisted of 1,930 genes, with 401 genes unique to C. michiganensis subsp. insidiosus. An operon for synthesis of the extracellular blue pigment indigoidine, enzymes for pectin degradation, and an operon for inositol metabolism are among the unique features. Secreted serine proteases belonging to both the pat-1 and ppa families were present but highly diverged from those in other subspecies.


July 19, 2019

Single molecule real-time (SMRT®) DNA sequencing of HLA genes at ultra-high resolution from 126 International HLA and Immunogenetics Workshop cell lines.

The hyperpolymorphic HLA genes play important roles in disease and transplantation and act as genetic markers of migration and evolution. A panel of 107 B-lymphoblastoid cell lines (B-LCLs) was established in 1987 at the 10th International Histocompatibility Workshop as a resource for the immunogenetics community. These B-LCLs are well characterised and represent diverse ethnicities and HLA haplotypes. Here we have applied Pacific Biosciences’ Single Molecule Real-Time (SMRT) DNA sequencing to HLA type 126 B-LCL, including the 107 IHIW cells, to ultra-high resolution. Amplicon sequencing of full-length HLA class I genes (HLA-A, -B and -C) and partial length HLA class II genes (HLA-DRB1, -DQB1 and -DPB1) was performed. We typed a total of 931 HLA alleles, 895 (96%) of which were consistent with the typing in the IPD-IMGT/HLA Database (Release 3.27.0, 2017-01-20), with 595 (64%) typed at a higher resolution. Discrepant types, including novel alleles (n=10) and changes in zygosity (n=13), as well as previously unreported types (n=34) were observed. In addition, patterns of linkage disequilibrium were distinguished by four-field resolution typing of HLA-B and HLA-C. By improving and standardising the HLA typing of these B-LCLs, we have ensured their continued usefulness as a resource for the immunogenetics community in the age of next generation DNA sequencing.This article is protected by copyright. All rights reserved.


July 19, 2019

Centromere evolution and CpG methylation during vertebrate speciation.

Centromeres and large-scale structural variants evolve and contribute to genome diversity during vertebrate speciation. Here, we perform de novo long-read genome assembly of three inbred medaka strains that are derived from geographically isolated subpopulations and undergo speciation. Using single-molecule real-time (SMRT) sequencing, we obtain three chromosome-mapped genomes of length ~734, ~678, and ~744Mbp with a resource of twenty-two centromeric regions of length 20-345kbp. Centromeres are positionally conserved among the three strains and even between four pairs of chromosomes that were duplicated by the teleost-specific whole-genome duplication 320-350 million years ago. The centromeres do not all evolve at a similar pace; rather, centromeric monomers in non-acrocentric chromosomes evolve significantly faster than those in acrocentric chromosomes. Using methylation sensitive SMRT reads, we uncover centromeres are mostly hypermethylated but have hypomethylated sub-regions that acquire unique sequence compositions independently. These findings reveal the potential of non-acrocentric centromere evolution to contribute to speciation.


July 19, 2019

Pacific Biosciences sequencing and IMGT/HighV-QUEST analysis of full-length single chain fragment variable from an in vivo selected phage-display combinatorial Library.

Phage-display selection of immunoglobulin (IG) or antibody single chain Fragment variable (scFv) from combinatorial libraries is widely used for identifying new antibodies for novel targets. Next-generation sequencing (NGS) has recently emerged as a new method for the high throughput characterization of IG and T cell receptor (TR) immune repertoires bothin vivoandin vitro. However, challenges remain for the NGS sequencing of scFv from combinatorial libraries owing to the scFv length (>800?bp) and the presence of two variable domains [variable heavy (VH) and variable light (VL) for IG] associated by a peptide linker in a single chain. Here, we show that single-molecule real-time (SMRT) sequencing with the Pacific Biosciences RS II platform allows for the generation of full-length scFv reads obtained from anin vivoselection of scFv-phages in an animal model of atherosclerosis. We first amplified the DNA of the phagemid inserts from scFv-phages eluted from an aortic section at the third round of thein vivoselection. From this amplified DNA, 450,558 reads were obtained from 15 SMRT cells. Highly accurate circular consensus sequences from these reads were generated, filtered by quality and then analyzed by IMGT/HighV-QUEST with the functionality for scFv. Full-length scFv were identified and characterized in 348,659 reads. Full-length scFv sequencing is an absolute requirement for analyzing the associated VH and VL domains enriched during thein vivopanning rounds. In order to further validate the ability of SMRT sequencing to provide high quality, full-length scFv sequences, we tracked the reads of an scFv-phage clone P3 previously identified by biological assays and Sanger sequencing. Sixty P3 reads showed 100% identity with the full-length scFv of 767?bp, 53 of them covering the whole insert of 977?bp, which encompassed the primer sequences. The remaining seven reads were identical over a shortened length of 939?bp that excludes the vicinity of primers at both ends. Interestingly these reads were obtained from each of the 15 SMRT cells. Thus, the SMRT sequencing method and the IMGT/HighV-QUEST functionality for scFv provides a straightforward protocol for characterization of full-length scFv from combinatorial phage libraries.


July 19, 2019

Cytogenomic identification and long-read single molecule real-time (SMRT) sequencing of a Bardet-Biedl Syndrome 9 (BBS9) deletion.

Bardet-Biedl syndrome (BBS) is a recessive disorder characterized by heterogeneous clinical manifestations, including truncal obesity, rod-cone dystrophy, renal anomalies, postaxial polydactyly, and variable developmental delays. At least 20 genes have been implicated in BBS, and all are involved in primary cilia function. We report a 1-year-old male child from Guyana with obesity, postaxial polydactyly on his right foot, hypotonia, ophthalmologic abnormalities, and developmental delay, which together indicated a clinical diagnosis of BBS. Clinical chromosomal microarray (CMA) testing and high-throughput BBS gene panel sequencing detected a homozygous 7p14.3 deletion of exons 1-4 of BBS9 that was encompassed by a 17.5?Mb region of homozygosity at chromosome 7p14.2-p21.1. The precise breakpoints of the deletion were delineated to a 72.8?kb region in the proband and carrier parents by third-generation long-read single molecule real-time (SMRT) sequencing (Pacific Biosciences), which suggested non-homologous end joining as a likely mechanism of formation. Long-read SMRT sequencing of the deletion breakpoints also determined that the aberration included the neighboring RP9 gene implicated in retinitis pigmentosa; however, the clinical significance of this was considered uncertain given the paucity of reported cases with unambiguous RP9 mutations. Taken together, our study characterized a BBS9 deletion, and the identification of this shared haplotype in the parents suggests that this pathogenic aberration may be a BBS founder mutation in the Guyanese population. Importantly, this informative case also highlights the utility of long-read SMRT sequencing to map nucleotide breakpoints of clinically relevant structural variants.


July 19, 2019

Sensitive detection of mitochondrial DNA variants for analysis of mitochondrial DNA-enriched extracts from frozen tumor tissue.

Large variation exists in mitochondrial DNA (mtDNA) not only between but also within individuals. Also in human cancer, tumor-specific mtDNA variation exists. In this work, we describe the comparison of four methods to extract mtDNA as pure as possible from frozen tumor tissue. Also, three state-of-the-art methods for sensitive detection of mtDNA variants were evaluated. The main aim was to develop a procedure to detect low-frequent single-nucleotide mtDNA-specific variants in frozen tumor tissue. We show that of the methods evaluated, DNA extracted from cytosol fractions following exonuclease treatment results in highest mtDNA yield and purity from frozen tumor tissue (270-fold mtDNA enrichment). Next, we demonstrate the sensitivity of detection of low-frequent single-nucleotide mtDNA variants (=1% allele frequency) in breast cancer cell lines MDA-MB-231 and MCF-7 by single-molecule real-time (SMRT) sequencing, UltraSEEK chemistry based mass spectrometry, and digital PCR. We also show de novo detection and allelic phasing of variants by SMRT sequencing. We conclude that our sensitive procedure to detect low-frequent single-nucleotide mtDNA variants from frozen tumor tissue is based on extraction of DNA from cytosol fractions followed by exonuclease treatment to obtain high mtDNA purity, and subsequent SMRT sequencing for (de novo) detection and allelic phasing of variants.


July 19, 2019

Genomic analysis of hospital plumbing reveals diverse reservoir of bacterial plasmids conferring carbapenem resistance.

The hospital environment is a potential reservoir of bacteria with plasmids conferring carbapenem resistance. Our Hospital Epidemiology Service routinely performs extensive sampling of high-touch surfaces, sinks, and other locations in the hospital. Over a 2-year period, additional sampling was conducted at a broader range of locations, including housekeeping closets, wastewater from hospital internal pipes, and external manholes. We compared these data with previously collected information from 5 years of patient clinical and surveillance isolates. Whole-genome sequencing and analysis of 108 isolates provided comprehensive characterization ofblaKPC/blaNDM-positive isolates, enabling an in-depth genetic comparison. Strikingly, despite a very low prevalence of patient infections withblaKPC-positive organisms, all samples from the intensive care unit pipe wastewater and external manholes contained carbapenemase-producing organisms (CPOs), suggesting a vast, resilient reservoir. We observed a diverse set of species and plasmids, and we noted species and susceptibility profile differences between environmental and patient populations of CPOs. However, there were plasmid backbones common to both populations, highlighting a potential environmental reservoir of mobile elements that may contribute to the spread of resistance genes. Clear associations between patient and environmental isolates were uncommon based on sequence analysis and epidemiology, suggesting reasonable infection control compliance at our institution. Nonetheless, a probable nosocomial transmission ofLeclerciasp. from the housekeeping environment to a patient was detected by this extensive surveillance. These data and analyses further our understanding of CPOs in the hospital environment and are broadly relevant to the design of infection control strategies in many infrastructure settings.IMPORTANCECarbapenemase-producing organisms (CPOs) are a global concern because of the morbidity and mortality associated with these resistant Gram-negative bacteria. Horizontal plasmid transfer spreads the resistance mechanism to new bacteria, and understanding the plasmid ecology of the hospital environment can assist in the design of control strategies to prevent nosocomial infections. A 5-year genomic and epidemiological survey was undertaken to study the CPOs in the patient-accessible environment, as well as in the plumbing system removed from the patient. This comprehensive survey revealed a vast, unappreciated reservoir of CPOs in wastewater, which was in contrast to the low positivity rate in both the patient population and the patient-accessible environment. While there were few patient-environmental isolate associations, there were plasmid backbones common to both populations. These results are relevant to all hospitals for which CPO colonization may not yet be defined through extensive surveillance.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.