July 19, 2019

SMRT genome assembly corrects reference errors, resolving the genetic basis of virulence in Mycobacterium tuberculosis.

The genetic basis of virulence in Mycobacterium tuberculosis has been investigated through genome comparisons of virulent (H37Rv) and attenuated (H37Ra) sister strains. Such analysis, however, relies heavily on the accuracy of the sequences. While the H37Rv reference genome has had several corrections to date, that of H37Ra has remained unmodified since its original publication. Here, we report the assembly and finishing of the H37Ra genome from single-molecule, real-time (SMRT) sequencing. Our assembly reveals that the number of H37Ra-specific variants is less than half of what the Sanger-based H37Ra reference sequence indicates, undermining and, in some cases, invalidating the conclusions of several studies. PE_PPE family genes, which are intractable to commonly used sequencing platforms because of their repetitive and GC-rich nature, are overrepresented in the set of genes in which all reported H37Ra-specific variants are contradicted. Further, one of the sequencing errors in H37Ra masks a true variant in common with the clinical strain CDC1551 which, when considered in the context of previous work, corresponds to a sequencing error in the H37Rv reference genome. Our results constrain the set of genomic differences possibly affecting virulence by more than half, which focuses laboratory investigation on pertinent targets and demonstrates the power of SMRT sequencing for producing high-quality reference genomes.


July 19, 2019

Complete genome sequence of Tessaracoccus sp. strain T2.5-30 isolated from 139.5 meters deep on the subsurface of the Iberian Pyritic Belt.

Here, we report the complete genome sequence of Tessaracoccus sp. strain T2.5-30, which consists of a chromosome with 3.2 Mbp, 70.4% G+C content, and 3,005 coding DNA sequences. The strain was isolated from a rock core retrieved at a depth of 139.5 m in the subsurface of the Iberian Pyritic Belt (Spain). Copyright © 2017 Leandro et al.
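The G+C content quoted above is the percentage of G and C bases among all called bases. As a minimal sketch (the function name `gc_content` and the toy sequence are illustrative assumptions, not taken from the paper), it could be computed like this:

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a DNA sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    called = sum(counts.values())
    if called == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (counts["G"] + counts["C"]) / called


# Toy sequence for illustration only (hypothetical, not from the genome).
toy = "GGCCGATC"
print(f"G+C content: {gc_content(toy):.1f}%")
```

Applied to the full chromosome sequence, the same calculation would be expected to give a value close to the reported 70.4%.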


July 19, 2019

Complete genome sequence of Vibrio campbellii strain 20130629003S01 isolated from shrimp with acute hepatopancreatic necrosis disease.

Vibrio campbellii is widely distributed in the marine environment and is an important pathogen of aquatic organisms such as shrimp, fish, and mollusks. An isolate of V. campbellii carrying the pirAB(vp) gene, causing acute hepatopancreatic necrosis disease (AHPND), has been reported. There are no previous reports about the complete genome of V. campbellii causing AHPND (VCAHPND). To extend our understanding of the pathogenesis of VCAHPND at the genomic level, the genome of V. campbellii 20130629003S01 isolated from a shrimp with AHPND was sequenced and analysed. The complete genome sequence of V. campbellii 20130629003S01 was generated using the PacBio RSII platform with single molecule, real-time sequencing. The 20130629003S01 strain consists of two circular chromosomes (3,621,712 bp in chromosome 1 and 2,245,751 bp in chromosome 2) and four plasmids of 70,066, 204,531, 143,140, and 86,121 bp. The genome contains a total of 5855 protein coding genes, 134 tRNA genes and 37 rRNA genes. The average nucleotide identity value of 20130629003S01 and other reference V. campbellii strains was 97.46%, suggesting that they are closely related. The genome sequence of V. campbellii 20130629003S01 and its comparative analysis with other V. campbellii strains that we present here are important for a better understanding of the genomic characteristics of VCAHPND.
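The replicon lengths given above can be summed to obtain the total genome size. The following is a small worked sketch; the plasmid labels ("plasmid 1" to "plasmid 4") are hypothetical names used only for illustration, while the lengths are those stated in the abstract.

```python
# Replicon lengths in bp, as stated in the abstract.
# The plasmid labels are hypothetical; the abstract does not name them.
replicon_bp = {
    "chromosome 1": 3_621_712,
    "chromosome 2": 2_245_751,
    "plasmid 1": 70_066,
    "plasmid 2": 204_531,
    "plasmid 3": 143_140,
    "plasmid 4": 86_121,
}

total_bp = sum(replicon_bp.values())
plasmid_bp = sum(
    length for name, length in replicon_bp.items() if name.startswith("plasmid")
)

print(f"total genome size: {total_bp:,} bp")
print(f"plasmid share of genome: {100 * plasmid_bp / total_bp:.1f}%")
```

Under these figures the six replicons sum to 6,371,321 bp, or about 6.37 Mbp.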


July 19, 2019

Contrasting evolutionary genome dynamics between domesticated and wild yeasts.

Structural rearrangements have long been recognized as an important source of genetic variation, with implications in phenotypic diversity and disease, yet their detailed evolutionary dynamics remain elusive. Here we use long-read sequencing to generate end-to-end genome assemblies for 12 strains representing major subpopulations of the partially domesticated yeast Saccharomyces cerevisiae and its wild relative Saccharomyces paradoxus. These population-level high-quality genomes with comprehensive annotation enable precise definition of chromosomal boundaries between cores and subtelomeres and a high-resolution view of evolutionary genome dynamics. In chromosomal cores, S. paradoxus shows faster accumulation of balanced rearrangements (inversions, reciprocal translocations and transpositions), whereas S. cerevisiae accumulates unbalanced rearrangements (novel insertions, deletions and duplications) more rapidly. In subtelomeres, both species show extensive interchromosomal reshuffling, with a higher tempo in S. cerevisiae. Such striking contrasts between wild and domesticated yeasts are likely to reflect the influence of human activities on structural genome evolution.


July 19, 2019

Comparative analysis of extended-spectrum-β-lactamase CTX-M-65-producing Salmonella enterica serovar Infantis isolates from humans, food animals, and retail chickens in the United States.

We sequenced the genomes of ten Salmonella enterica serovar Infantis isolates containing blaCTX-M-65, obtained from chicken, cattle, and human sources collected between 2012 and 2015 in the United States through routine NARMS surveillance and product sampling programs. We also completely assembled the plasmids from four of the isolates. All isolates had a D87Y mutation in the gyrA gene and harbored between 7 and 10 resistance genes (aph(4)-Ia, aac(3)-IVa, aph(3′)-Ic, blaCTX-M-65, fosA3, floR, dfrA14, sul1, tetA, aadA1) located in two distinct sites of a megaplasmid (~316-323 kb) similar to that described in a blaCTX-M-65-positive S. Infantis isolated from a patient in Italy. High-quality single nucleotide polymorphism (hqSNP) analysis revealed that all U.S. isolates were closely related, separated by only 1 to 38 pairwise high-quality SNPs, indicating a high likelihood that strains from humans, chicken, and cattle recently evolved from a common ancestor. The U.S. isolates were genetically similar to the blaCTX-M-65-positive S. Infantis isolate from Italy, with a separation of 34 to 47 SNPs. This is the first report of the blaCTX-M-65 gene and the pESI-like megaplasmid from S. Infantis in the United States, and illustrates the importance of applying a global One Health, human and animal perspective to combat antimicrobial resistance. Copyright © 2017 American Society for Microbiology.


July 19, 2019

First report of two complete Clostridium chauvoei genome sequences and detailed in silico genome analysis.

Clostridium (C.) chauvoei is a Gram-positive, spore-forming, anaerobic bacterium. It causes black leg in ruminants, a typically fatal histotoxic myonecrosis. High-quality circular genome sequences were generated for the C. chauvoei type strain DSM 7528(T) (ATCC 10092(T)) and a field strain 12S0467 isolated in Germany. The origin of replication (oriC) was comparable to that of Bacillus subtilis in structure, with two regions containing DnaA boxes. Similar prophages were identified in the genomes of both C. chauvoei strains, which also harbored hemolysin and bacterial spore formation genes. A CRISPR type I-B system with limited variations in the repeat number was identified. Sporulation- and germination-related genes were homologous to those of the Clostridia cluster I group, but novel variations in regulatory genes were identified, indicative of strain-specific control of regulatory events. Phylogenomics showed a higher relatedness to C. septicum than to other genomes sequenced so far of species belonging to the genus Clostridium. Comparative genome analysis of three C. chauvoei circular genome sequences revealed the presence of few inversions and translocations in locally collinear blocks (LCBs). The species genome also shows a large number of genes involved in proteolysis, genes for glycosyl hydrolases and metal iron transportation genes which are presumably involved in virulence and survival in the host. Three conserved flagellar genes (fliC) were identified in each of the circular genomes. In conclusion, this is the first comparative analysis of circular genomes for the species C. chauvoei, enabling insights into genome composition and virulence factor variation. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.


July 19, 2019

PacBio but not Illumina technology can achieve fast, accurate and complete closure of the high GC, complex Burkholderia pseudomallei two-chromosome genome

Although PacBio third-generation sequencers have improved the read lengths of genome sequencing, which facilitates the assembly of complete genomes, no study has reported success in using PacBio data alone to completely sequence a two-chromosome bacterial genome from a single library in a single run. Previous studies using earlier versions of sequencing chemistries have at most been able to finish bacterial genomes containing only one chromosome with de novo assembly. In this study, we compared the robustness of PacBio RS II, using one SMRT cell and the latest P6-C4 chemistry, with Illumina HiSeq 1500 in sequencing the genome of Burkholderia pseudomallei, a bacterium which contains two large circular chromosomes, very high G+C content of 68–69%, highly repetitive regions and substantial genomic diversity, and represents one of the largest and most complex bacterial genomes sequenced. The comparison used a reference genome generated by hybrid assembly using PacBio and Illumina datasets with subsequent manual validation. Results showed that PacBio data with de novo assembly, but not Illumina, was able to completely sequence the B. pseudomallei genome without any gaps or mis-assemblies. The two large contigs of the PacBio assembly aligned unambiguously to the reference genome, sharing >99.9% nucleotide identities. Conversely, Illumina data assembled using three different assemblers resulted in fragmented assemblies (201–366 contigs), sharing only 92.2–100% and 92.0–100% nucleotide identities to chromosomes I and II reference sequences, respectively, with no indication that the B. pseudomallei genome consisted of two chromosomes with four copies of ribosomal operons. Among all assemblies, the PacBio assembly recovered the highest number of core and virulence proteins, and housekeeping genes based on whole-genome multilocus sequence typing (wgMLST). Most notably, assembly solely based on PacBio outperformed even hybrid assembly using both PacBio and Illumina datasets. The hybrid approach generated only 74 contigs, while the PacBio data alone with de novo assembly achieved complete closure of the two-chromosome B. pseudomallei genome without additional costly bench work and further sequencing. PacBio RS II using P6-C4 chemistry is highly robust and cost-effective and should be the platform of choice in sequencing bacterial genomes, particularly for those that are well known to be difficult to sequence.


July 19, 2019

A mobile pathogenicity chromosome in Fusarium oxysporum for infection of multiple cucurbit species.

The genome of Fusarium oxysporum (Fo) consists of a set of eleven ‘core’ chromosomes, shared by most strains and responsible for housekeeping, and one or several accessory chromosomes. We sequenced a strain of Fo f.sp. radicis-cucumerinum (Forc) using PacBio SMRT sequencing. All but one of the core chromosomes were assembled into single contigs, and a chromosome that shows all the hallmarks of a pathogenicity chromosome comprised two contigs. A central part of this chromosome contains all identified candidate effector genes, including homologs of SIX6, SIX9, SIX11 and SIX13. We show that SIX6 contributes to virulence of Forc. Through horizontal chromosome transfer (HCT) to a non-pathogenic strain, we also show that the accessory chromosome containing the SIX gene homologs is indeed a pathogenicity chromosome for cucurbit infection. Conversely, complete loss of virulence was observed in Forc016 strains that lost this chromosome. We conclude that a non-wilt-inducing Fo pathogen also relies on effector proteins for successful infection and that the Forc pathogenicity chromosome contains all the information necessary for causing root rot of cucurbits. Three out of nine HCT strains investigated have undergone large-scale chromosome alterations, reflecting the remarkable plasticity of Fo genomes.


July 19, 2019

Gapless genome assembly of Colletotrichum higginsianum reveals chromosome structure and association of transposable elements with secondary metabolite gene clusters.

The ascomycete fungus Colletotrichum higginsianum causes anthracnose disease of brassica crops and the model plant Arabidopsis thaliana. Previous versions of the genome sequence were highly fragmented, causing errors in the prediction of protein-coding genes and preventing the analysis of repetitive sequences and genome architecture. Here, we re-sequenced the genome using single-molecule real-time (SMRT) sequencing technology and, in combination with optical map data, this provided a gapless assembly of all twelve chromosomes except for the ribosomal DNA repeat cluster on chromosome 7. The more accurate gene annotation made possible by this new assembly revealed a large repertoire of secondary metabolism (SM) key genes (89) and putative biosynthetic pathways (77 SM gene clusters). The two mini-chromosomes differed from the ten core chromosomes in being repeat- and AT-rich and gene-poor but were significantly enriched with genes encoding putative secreted effector proteins. Transposable elements (TEs) were found to occupy 7% of the genome by length. Certain TE families showed a statistically significant association with effector genes and SM cluster genes and were transcriptionally active at particular stages of fungal development. All 24 subtelomeres were found to contain one of three highly conserved repeat elements which, by providing sites for homologous recombination, were probably instrumental in four segmental duplications. The gapless genome of C. higginsianum provides access to repeat-rich regions that were previously poorly assembled, notably the mini-chromosomes and subtelomeres, and allowed prediction of the complete SM gene repertoire. It also provides insights into the potential role of TEs in gene and genome evolution and host adaptation in this asexual pathogen.


July 19, 2019

De novo assembly of genomes from long sequence reads reveals uncharted territories of Propionibacterium freudenreichii.

Propionibacterium freudenreichii is an industrially important bacterium granted Generally Recognized as Safe (GRAS) status, due to its long safe use in food bioprocesses. Despite the recognized role in the food industry and in the production of vitamin B12, as well as its documented health-promoting potential, P. freudenreichii remained poorly characterised at the genomic level. At present, only three complete genome sequences are available for the species. We used the PacBio RS II sequencing platform to generate complete genomes of 20 P. freudenreichii strains and compared them in detail. Comparative analyses revealed both sequence conservation and genome organisational diversity among the strains. Assembly from long reads resulted in the discovery of additional circular elements: two putative conjugative plasmids and three active, lysogenic bacteriophages. It also permitted characterisation of the CRISPR-Cas systems. The use of the PacBio sequencing platform allowed identification of DNA modifications, which in turn allowed characterisation of the restriction-modification systems together with their recognition motifs. The observed genomic differences suggested strain variation in surface piliation and specific mucus binding, which were validated by experimental studies. The phenotypic characterisation displayed large diversity between the strains in ability to utilise a range of carbohydrates, to grow under unfavourable conditions and to form a biofilm. The complete genome sequencing allowed detailed characterisation of the industrially important species P. freudenreichii by facilitating the discovery of previously unknown features. The results presented here lay a solid foundation for future genetic and functional genomic investigations of this actinobacterial species.


July 19, 2019

Methylation in Mycobacterium tuberculosis is lineage specific with associated mutations present globally.

DNA methylation is an epigenetic modification of the genome involved in regulating crucial cellular processes, including transcription and chromosome stability. Advances in PacBio sequencing technologies can be used to robustly reveal methylation sites. The methylome of the Mycobacterium tuberculosis complex is poorly understood but may be involved in virulence, hypoxic survival and the emergence of drug resistance. In the most extensive study to date, we characterise the methylome across the 4 major lineages of M. tuberculosis and 2 lineages of M. africanum, the leading causes of tuberculosis disease in humans. We reveal lineage-specific methylated motifs and strain-specific mutations that are abundant globally and likely to explain loss of function in the respective methyltransferases. Our work provides a set of sixteen new complete reference genomes for the Mycobacterium tuberculosis complex, including complete lineage 5 genomes. Insights into lineage-specific methylomes will further elucidate underlying biological mechanisms and other important phenotypes of the epi-genome.


July 19, 2019

Resolving the complete genome of Kuenenia stuttgartiensis from a membrane bioreactor enrichment using Single-Molecule Real-Time sequencing.

Anaerobic ammonium-oxidizing (anammox) bacteria are a group of strictly anaerobic chemolithoautotrophic microorganisms. They are capable of oxidizing ammonium to nitrogen gas using nitrite as a terminal electron acceptor, thereby facilitating the release of fixed nitrogen into the atmosphere. The anammox process is thought to exert a profound impact on the global nitrogen cycle and has been harnessed as an environment-friendly method for nitrogen removal from wastewater. In this study, we present the first closed genome sequence of an anammox bacterium, Kuenenia stuttgartiensis MBR1. It was obtained through Single-Molecule Real-Time (SMRT) sequencing of an enrichment culture constituting a mixture of at least two highly similar Kuenenia strains. The genome of the novel MBR1 strain is different from the previously reported Kuenenia KUST reference genome as it contains numerous structural variations and unique genomic regions. We find new proteins, such as a type 3b (sulf)hydrogenase and an additional copy of the hydrazine synthase gene cluster. Moreover, multiple copies of ammonium transporters and proteins regulating nitrogen uptake were identified, suggesting functional differences in metabolism. This assembly, including the genome-wide methylation profile, provides a new foundation for comparative and functional studies aiming to elucidate the biochemical and metabolic processes of these organisms.


July 19, 2019

Long-read sequence assembly of the firefly Pyrocoelia pectoralis genome.

Fireflies are a family of insects within the beetle order Coleoptera, or winged beetles, and they are one of the most well-known and loved insect species because of their bioluminescence. However, the firefly is in danger of extinction because of the massive destruction of its living environment. In order to improve the understanding of fireflies and protect them effectively, we sequenced the whole genome of the terrestrial firefly Pyrocoelia pectoralis. Here, we developed a highly reliable genome resource for the terrestrial firefly Pyrocoelia pectoralis (E. Oliv., 1883; Coleoptera: Lampyridae) using single molecule real time (SMRT) sequencing on the PacBio Sequel platform. In total, 57.8 Gb of long reads were generated and assembled into a 760.4-Mb genome, which is close to the estimated genome size and covered 98.7% complete and 0.7% partial insect Benchmarking Universal Single-Copy Orthologs. The k-mer analysis showed that this genome is highly heterozygous. However, our long-read assembly demonstrates continuousness with a contig N50 length of 3.04 Mb and the longest contig length of 13.69 Mb. Furthermore, 135 589 SSRs and 341 Mb of repeat sequences were detected. A total of 23 092 genes were predicted; 88.44% of genes were annotated with one or more related functions. We assembled a high-quality firefly genome, which will not only provide insights into the conservation and biodiversity of fireflies, but also provide a wealth of information to study the mechanisms of their sexual communication, bio-luminescence, and evolution. © The Authors 2017. Published by Oxford University Press.
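Two of the figures above are simple to reproduce. Contig N50 is the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly size. The approximate sequencing depth is the total read yield divided by the assembly size. The sketch below is illustrative only: the function name `n50` and the toy contig lengths are assumptions, and only the 57.8 Gb and 760.4 Mb figures come from the abstract.

```python
def n50(contig_lengths: list[int]) -> int:
    """Return the N50 of a list of contig lengths."""
    if not contig_lengths:
        raise ValueError("no contigs given")
    half_total = sum(contig_lengths) / 2
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy contig lengths (hypothetical, not the firefly assembly).
toy_contigs = [10, 8, 6, 4, 2]
print("toy N50:", n50(toy_contigs))

# Approximate depth from the figures stated in the abstract.
read_bases = 57.8e9      # 57.8 Gb of long reads
assembly_bases = 760.4e6  # 760.4-Mb assembly
depth = read_bases / assembly_bases
print(f"approximate depth: {depth:.1f}x")
```

With the stated yield and assembly size, the depth works out to roughly 76-fold.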


July 19, 2019

Expanding an expanded genome: long-read sequencing of Trypanosoma cruzi.

Although the genome of Trypanosoma cruzi, the causative agent of Chagas disease, was first made available in 2005, with additional strains reported later, the intrinsic genome complexity of this parasite (the abundance of repetitive sequences and genes organized in tandem) has traditionally hindered high-quality genome assembly and annotation. This also limits diverse types of analyses that require high degrees of precision. Long reads generated by third-generation sequencing technologies are particularly suitable to address the challenges associated with T. cruzi’s genome since they permit direct determination of the full sequence of large clusters of repetitive sequences without collapsing them. This, in turn, not only allows accurate estimation of gene copy numbers but also circumvents assembly fragmentation. Here, we present the analysis of the genome sequences of two T. cruzi clones: the hybrid TCC (TcVI) and the non-hybrid Dm28c (TcI), determined by PacBio Single Molecule Real-Time (SMRT) technology. The improved assemblies obtained here permitted us to accurately estimate gene copy numbers and the abundance and distribution of repetitive sequences (including satellites and retroelements). We found that the genome of T. cruzi is composed of a ‘core compartment’ and a ‘disruptive compartment’ which exhibit opposite GC content and gene composition. Novel tandem and dispersed repetitive sequences were identified, including some located inside coding sequences. Additionally, homologous chromosomes were separately assembled, allowing us to retrieve haplotypes as separate contigs instead of a unique mosaic sequence. Finally, manual annotation of surface multigene families, mucins and trans-sialidases now allows a better overview of these complex groups of genes.


July 19, 2019

Male-killing toxin in a bacterial symbiont of Drosophila.

Several lineages of symbiotic bacteria in insects selfishly manipulate host reproduction to spread in a population [1], often by distorting host sex ratios. Spiroplasma poulsonii [2,3] is a helical and motile, Gram-positive symbiotic bacterium that resides in a wide range of Drosophila species [4]. A notable feature of S. poulsonii is male killing, whereby the sons of infected female hosts are selectively killed during development [1,2]. Although male killing caused by S. poulsonii has been studied since the 1950s, its underlying mechanism is unknown. Here we identify an S. poulsonii protein, designated Spaid, whose expression induces male killing. Overexpression of Spaid in D. melanogaster kills males but not females, and induces massive apoptosis and neural defects, recapitulating the pathology observed in S. poulsonii-infected male embryos [5-11]. Our data suggest that Spaid targets the dosage compensation machinery on the male X chromosome to mediate its effects. Spaid contains ankyrin repeats and a deubiquitinase domain, which are required for its subcellular localization and activity. Moreover, we found a laboratory mutant strain of S. poulsonii with reduced male-killing ability and a large deletion in the spaid locus. Our study has uncovered a bacterial protein that affects host cellular machinery in a sex-specific way, which is likely to be the long-searched-for factor responsible for S. poulsonii-induced male killing.

