July 19, 2019

An improved Plasmodium cynomolgi genome assembly reveals an unexpected methyltransferase gene expansion.

Plasmodium cynomolgi, a non-human primate malaria parasite species, has been an important model parasite since its discovery in 1907. Similarities in the biology of P. cynomolgi to the closely related, but less tractable, human malaria parasite P. vivax make it the model parasite of choice for liver biology and vaccine studies pertinent to P. vivax malaria. Molecular and genome-scale studies of P. cynomolgi have relied on the current reference genome sequence, which remains highly fragmented with 1,649 unassigned scaffolds and little representation of the subtelomeres. Using long-read sequence data (Pacific Biosciences SMRT technology), we assembled and annotated a new reference genome sequence, PcyM, sourced from an Indian rhesus monkey. We compare the newly assembled genome sequence with those of several other Plasmodium species, including a re-annotated P. coatneyi assembly. The new PcyM genome assembly is of significantly higher quality than the existing reference, comprising only 56 pieces, no gaps and an improved average gene length. Detailed manual curation has ensured a comprehensive annotation of the genome with 6,632 genes, nearly 1,000 more than previously attributed to P. cynomolgi. The new assembly also has an improved representation of the subtelomeric regions, which account for nearly 40% of the sequence. Within the subtelomeres, we identified more than 1,300 Plasmodium interspersed repeat (pir) genes, as well as a striking expansion of 36 methyltransferase pseudogenes that originated from a single copy on chromosome 9. The manually curated PcyM reference genome sequence is an important new resource for the malaria research community. The high quality and contiguity of the data have enabled the discovery of a novel expansion of methyltransferase genes in the subtelomeres, and illustrate the new comparative genomics capabilities that are being unlocked by complete reference genomes.


July 19, 2019

Discovery and biosynthesis of gladiolin: A Burkholderia gladioli antibiotic with promising activity against Mycobacterium tuberculosis.

An antimicrobial activity screen of Burkholderia gladioli BCC0238, a clinical isolate from a cystic fibrosis patient, led to the discovery of gladiolin, a novel macrolide antibiotic with potent activity against Mycobacterium tuberculosis H37Rv. Gladiolin is structurally related to etnangien, a highly unstable antibiotic from Sorangium cellulosum that is also active against Mycobacteria. Like etnangien, gladiolin was found to inhibit RNA polymerase, a validated drug target in M. tuberculosis. However, gladiolin lacks the highly labile hexaene moiety of etnangien and was thus found to possess significantly increased chemical stability. Moreover, gladiolin displayed low mammalian cytotoxicity and good activity against several M. tuberculosis clinical isolates, including four that are resistant to isoniazid and one that is resistant to both isoniazid and rifampicin. Overall, these data suggest that gladiolin may represent a useful starting point for the development of novel drugs to tackle multidrug-resistant tuberculosis. The B. gladioli BCC0238 genome was sequenced using Single Molecule Real Time (SMRT) technology. This resulted in four contiguous sequences: two large circular chromosomes and two smaller putative plasmids. Analysis of the chromosome sequences identified 49 putative specialized metabolite biosynthetic gene clusters. One such gene cluster, located on the smaller of the two chromosomes, encodes a trans-acyltransferase (trans-AT) polyketide synthase (PKS) multienzyme that was hypothesized to assemble gladiolin. Insertional inactivation of a gene in this cluster encoding one of the PKS subunits abrogated gladiolin production, confirming that the gene cluster is responsible for biosynthesis of the antibiotic. 
Comparison of the PKSs responsible for the assembly of gladiolin and etnangien showed that they possess a remarkably similar architecture, obfuscating the biosynthetic mechanisms responsible for most of the structural differences between the two metabolites.


July 19, 2019

PacBio but not Illumina technology can achieve fast, accurate and complete closure of the high GC, complex Burkholderia pseudomallei two-chromosome genome

Although PacBio third-generation sequencers have improved the read lengths of genome sequencing, which facilitates the assembly of complete genomes, no study has reported success in using PacBio data alone to completely sequence a two-chromosome bacterial genome from a single library in a single run. Previous studies using earlier versions of sequencing chemistries have at most been able to finish bacterial genomes containing only one chromosome with de novo assembly. In this study, we compared the robustness of PacBio RS II, using one SMRT cell and the latest P6-C4 chemistry, with Illumina HiSeq 1500 in sequencing the genome of Burkholderia pseudomallei, a bacterium which contains two large circular chromosomes, very high G+C content of 68–69%, highly repetitive regions and substantial genomic diversity, and represents one of the largest and most complex bacterial genomes sequenced. Both were evaluated against a reference genome generated by hybrid assembly using PacBio and Illumina datasets with subsequent manual validation. Results showed that PacBio data with de novo assembly, but not Illumina, was able to completely sequence the B. pseudomallei genome without any gaps or mis-assemblies. The two large contigs of the PacBio assembly aligned unambiguously to the reference genome, sharing >99.9% nucleotide identities. Conversely, Illumina data assembled using three different assemblers resulted in fragmented assemblies (201–366 contigs), sharing only 92.2–100% and 92.0–100% nucleotide identities to chromosomes I and II reference sequences, respectively, with no indication that the B. pseudomallei genome consisted of two chromosomes with four copies of ribosomal operons. Among all assemblies, the PacBio assembly recovered the highest number of core and virulence proteins, and housekeeping genes based on whole-genome multilocus sequence typing (wgMLST). Most notably, assembly solely based on PacBio outperformed even hybrid assembly using both PacBio and Illumina datasets.
The hybrid approach generated 74 contigs, whereas the PacBio data alone with de novo assembly achieved complete closure of the two-chromosome B. pseudomallei genome without additional costly bench work and further sequencing. PacBio RS II using P6-C4 chemistry is highly robust and cost-effective and should be the platform of choice in sequencing bacterial genomes, particularly for those that are well known to be difficult to sequence.


July 19, 2019

Increased risk of low birth weight in women with placental malaria associated with P. falciparum VAR2CSA clade.

Pregnancy-associated malaria (PAM) causes adverse pregnancy and birth outcomes owing to Plasmodium falciparum accumulation in the placenta. Placental accumulation is mediated by P. falciparum protein VAR2CSA, a leading PAM-specific vaccine target. The extent of its antigen diversity and impact on clinical outcomes remain poorly understood. Through amplicon deep sequencing of placental malaria samples from women in Malawi and Benin, we assessed sequence diversity of VAR2CSA’s ID1-DBL2x region, containing putative vaccine targets, and estimated associations of specific clades with adverse birth outcomes. Overall, var2csa diversity was high and haplotypes subdivided into five clades, the largest two defined by homology to parasite strains 3D7 or FCR3. Across both cohorts, compared to women infected with only FCR3-like variants, women infected with only 3D7-like variants delivered infants with lower birthweight (difference: -267.99 g; 95% Confidence Interval [CI]: -466.43 g, -69.55 g) and higher odds of low birthweight (<2500 g) (Odds Ratio [OR]: 5.41; 95% CI: 0.99, 29.52) and small-for-gestational-age (OR: 3.65; 95% CI: 1.01, 13.38). In two distinct malaria-endemic African settings, parasites harboring 3D7-like variants of VAR2CSA were associated with worse birth outcomes, supporting differential effects of infection with specific parasite strains. The immense diversity, coupled with differential clinical effects of this diversity, suggests that an effective VAR2CSA-based vaccine may require multivalent activity.


July 19, 2019

Gapless genome assembly of Colletotrichum higginsianum reveals chromosome structure and association of transposable elements with secondary metabolite gene clusters.

The ascomycete fungus Colletotrichum higginsianum causes anthracnose disease of brassica crops and the model plant Arabidopsis thaliana. Previous versions of the genome sequence were highly fragmented, causing errors in the prediction of protein-coding genes and preventing the analysis of repetitive sequences and genome architecture. Here, we re-sequenced the genome using single-molecule real-time (SMRT) sequencing technology and, in combination with optical map data, this provided a gapless assembly of all twelve chromosomes except for the ribosomal DNA repeat cluster on chromosome 7. The more accurate gene annotation made possible by this new assembly revealed a large repertoire of secondary metabolism (SM) key genes (89) and putative biosynthetic pathways (77 SM gene clusters). The two mini-chromosomes differed from the ten core chromosomes in being repeat- and AT-rich and gene-poor but were significantly enriched with genes encoding putative secreted effector proteins. Transposable elements (TEs) were found to occupy 7% of the genome by length. Certain TE families showed a statistically significant association with effector genes and SM cluster genes and were transcriptionally active at particular stages of fungal development. All 24 subtelomeres were found to contain one of three highly conserved repeat elements which, by providing sites for homologous recombination, were probably instrumental in four segmental duplications. The gapless genome of C. higginsianum provides access to repeat-rich regions that were previously poorly assembled, notably the mini-chromosomes and subtelomeres, and allowed prediction of the complete SM gene repertoire. It also provides insights into the potential role of TEs in gene and genome evolution and host adaptation in this asexual pathogen.


July 19, 2019

A novel approach using long-read sequencing and ddPCR to investigate gonadal mosaicism and estimate recurrence risk in two families with developmental disorders.

De novo mutations contribute significantly to severe early-onset genetic disorders. Even if the mutation is apparently de novo, there is a recurrence risk due to parental germ line mosaicism, depending on the gonadal generation in which the mutation occurred. We demonstrate the power of using SMRT sequencing and ddPCR to determine parental origin and allele frequencies of de novo mutations in germ cells in two families who had undergone assisted reproduction. In the first family, a TCOF1 variant c.3156C>T was identified in the proband with Treacher Collins syndrome. The variant affects splicing and was determined to be of paternal origin. It was present in <1% of the paternal germ cells, suggesting a very low recurrence risk. In the second family, the couple had undergone several unsuccessful pregnancies where a de novo mutation PTPN11 c.923A>C causing Noonan syndrome was identified. The variant was present in 40% of the paternal germ cells, suggesting a high recurrence risk. Our findings highlight a successful strategy to identify the parental origin of mutations and to investigate the recurrence risk in couples that have undergone assisted reproduction with an unknown donor or in couples with gonadal mosaicism that will undergo preimplantation genetic diagnosis. © 2017 The Authors. Prenatal Diagnosis published by John Wiley & Sons Ltd.


July 19, 2019

HIV envelope glycoform heterogeneity and localized diversity govern the initiation and maturation of a V2 apex broadly neutralizing antibody lineage.

Understanding how broadly neutralizing antibodies (bnAbs) to HIV envelope (Env) develop during natural infection can help guide the rational design of an HIV vaccine. Here, we described a bnAb lineage targeting the Env V2 apex and the Ab-Env co-evolution that led to development of neutralization breadth. The lineage Abs bore an anionic heavy chain complementarity-determining region 3 (CDRH3) of 25 amino acids, among the shortest known for this class of Abs, and achieved breadth with only 10% nucleotide somatic hypermutation and no insertions or deletions. The data suggested a role for Env glycoform heterogeneity in the activation of the lineage germline B cell. Finally, we showed that localized diversity at key V2 epitope residues drove bnAb maturation toward breadth, mirroring the Env evolution pattern described for another donor who developed V2-apex targeting bnAbs. Overall, these findings suggest potential strategies for vaccine approaches based on germline-targeting and serial immunogen design. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.


July 19, 2019

Mapping the landscape of tandem repeat variability by targeted long read single molecule sequencing in familial X-linked intellectual disability.

The etiology of more than half of all patients with X-linked intellectual disability remains elusive, despite array-based comparative genomic hybridization, whole exome or genome sequencing. Since short-read massively parallel sequencing approaches do not allow the detection of larger tandem repeat expansions, we hypothesized that such expansions could be a hidden cause of X-linked intellectual disability. We selectively captured over 1800 tandem repeats on the X chromosome and characterized them by long read single molecule sequencing in 3 families with idiopathic X-linked intellectual disability. In male DNA samples, full tandem repeat length sequences were obtained for 88-93% of the targets and up to 99.6% of the repeats with a moderate guanine-cytosine content. The read length and analysis pipeline allow detection of tandem repeat expansions of >900 bp. In one family, one repeat expansion co-occurs with down-regulation of the neighboring MIR222 gene. This gene has previously been implicated in intellectual disability and is apparently linked to FMR1 and NEFH overexpression associated with neurological disorders. This study demonstrates the power of single molecule sequencing to measure tandem repeat lengths and detect expansions, and suggests that tandem repeat mutations may be a hidden cause of X-linked intellectual disability.


July 7, 2019

Complete genome sequence of Yersinia ruckeri strain CSF007-82, etiologic agent of red mouth disease in salmonid fish.

We present the complete, closed, and finished chromosomal and extrachromosomal genome sequences of Yersinia ruckeri strain CSF007-82, the etiologic agent of enteric red mouth disease in salmonid fish. The chromosome is 3,799,036 bp with a G+C content of 47.5% and encodes 3,530 predicted coding sequences (CDS), 7 ribosomal operons, and 80 tRNAs. Copyright © 2015 Nelson et al.


July 7, 2019

Complete genome sequence of a carbapenem-resistant extraintestinal pathogenic Escherichia coli strain belonging to the sequence type 131 H30R subclade.

Here, we report the completed genome sequence of a carbapenem-resistant extraintestinal pathogenic Escherichia coli sequence type 131 (ST131) isolate, MNCRE44. The isolate was obtained in 2012 in Minnesota, USA, from a sputum sample from a hospitalized patient with multiple comorbidities, and it belongs to the H30R sublineage. Copyright © 2015 Johnson et al.


July 7, 2019

Complete genome sequencing of protease-producing novel Arthrobacter sp. strain IHBB 11108 using PacBio Single-Molecule Real-Time Sequencing technology.

A previously uncharacterized species of the genus Arthrobacter, strain IHBB 11108 (MCC 2780), is a Gram-positive, strictly aerobic, nonmotile, cold-adapted, and protease-producing alkaliphilic actinobacterium, isolated from shallow undersurface water from Chandra Tal Lake, Lahaul-Spiti, India. The complete genome of the strain is 3.6 Mb in size with an average 58.97% G+C content.


July 7, 2019

Complete genome sequence of Serratia multitudinisentens RB-25(T), a novel chitinolytic bacterium.

Serratia multitudinisentens RB-25(T) (=DSM 28811(T) =LMG 28304(T)) is a newly proposed type strain in the genus Serratia, isolated from a municipal landfill site. Here, we present the complete genome of S. multitudinisentens RB-25(T), which contains a complete chitinase operon and other chitin and N-acetylglucosamine utilisation enzymes. To our knowledge, this is the first report of the complete genome sequence of this novel isolate and its chitinase gene discovery. Copyright © 2015 Elsevier B.V. All rights reserved.


July 7, 2019

Complete genome sequence of Microcystis aeruginosa NIES-2549, a bloom-forming cyanobacterium from Lake Kasumigaura, Japan.

Microcystis aeruginosa NIES-2549 is a freshwater bloom-forming cyanobacterium isolated from Lake Kasumigaura, Japan. We report the complete 4.29-Mbp genome sequence of NIES-2549 and its annotation and discuss the genetic diversity of M. aeruginosa strains. This is the third genome sequence of M. aeruginosa isolated from Lake Kasumigaura. Copyright © 2015 Yamaguchi et al.

