July 19, 2019

Comprehensive bioinformatics analysis of Mycoplasma pneumoniae genomes to investigate underlying population structure and type-specific determinants.

Mycoplasma pneumoniae is a significant cause of respiratory illness worldwide. Despite a minimal and highly conserved genome, genetic diversity within the species may impact disease. We performed whole genome sequencing (WGS) analysis of 107 M. pneumoniae isolates, including 67 newly sequenced using the Pacific Biosciences RS II and/or Illumina MiSeq sequencing platforms. Comparative genomic analysis of 107 genomes revealed >3,000 single nucleotide polymorphisms (SNPs) in total, including 520 type-specific SNPs. Population structure analysis supported the existence of six distinct subgroups, three within each type. We developed a predictive model to classify an isolate based on whole genome SNPs called against the reference genome into the identified subtypes, obviating the need for genome assembly. This study is the most comprehensive WGS analysis for M. pneumoniae to date, underscoring the power of combining complementary sequencing technologies to overcome difficult-to-sequence regions and highlighting potential differential genomic signatures in M. pneumoniae.


July 19, 2019

A new chicken genome assembly provides insight into avian genome structure.

The importance of the Gallus gallus (chicken) as a model organism and agricultural animal merits a continuation of sequence assembly improvement efforts. We present a new version of the chicken genome assembly (Gallus_gallus-5.0; GCA_000002315.3), built from combined long single molecule sequencing technology, finished BACs, and improved physical maps. In overall assembled bases, we see a gain of 183 Mb, including 16.4 Mb in placed chromosomes with a corresponding gain in the percentage of intact repeat elements characterized. Of the 1.21 Gb genome, we include three previously missing autosomes, GGA30, 31, and 33, and improve sequence contig length 10-fold over the previous Gallus_gallus-4.0. Despite the significant base representation improvements made, 138 Mb of sequence is not yet located to chromosomes. When annotated for gene content, Gallus_gallus-5.0 shows an increase of 4679 annotated genes (2768 noncoding and 1911 protein-coding) over those in Gallus_gallus-4.0. We also revisited the question of what genes are missing in the avian lineage, as assessed by the highest quality avian genome assembly to date, and found that a large fraction of the original set of missing genes are still absent in sequenced bird species. Finally, our new data support a detailed map of MHC-B, encompassing two segments: one with a highly stable gene copy number and another in which the gene copy number is highly variable. The chicken model has been a critical resource for many other fields of study, and this new reference assembly will substantially further these efforts. Copyright © 2017 Warren et al.


July 19, 2019

Revealing complete complex KIR haplotypes phased by long-read sequencing technology

The killer cell immunoglobulin-like receptor (KIR) region of human chromosome 19 contains up to 16 genes for natural killer (NK) cell receptors that recognize human leukocyte antigen (HLA)/peptide complexes and other ligands. The KIR proteins fulfill functional roles in infections, pregnancy, autoimmune diseases and transplantation. However, their characterization remains a constant challenge. Not only are the genes highly homologous due to their recent evolution by tandem duplications, but the region is structurally dynamic due to frequent transposon-mediated recombination. A sequencing approach that precisely captures the complexity of KIR haplotypes for functional annotation is desirable. We present a unique approach to haplotype the KIR loci using single-molecule, real-time (SMRT) sequencing. Using this method, we have, for the first time, comprehensively sequenced and phased sixteen KIR haplotypes from eight individuals without imputation. The information revealed four novel haplotype structures, a novel gene-fusion allele, novel and confirmed insertion/deletion events, a homozygous individual, and overall diversity for the structural haplotypes and their alleles. These KIR haplotypes augment our existing knowledge by providing high-quality references, evolutionary informers, and source material for imputation. The haplotype sequences and gene annotations provide alternative loci for the KIR region in the human genome reference GRCh38.p8.


July 19, 2019

Complete genome sequences of isolates of Enterococcus faecium sequence type 117, a globally disseminated multidrug-resistant clone.

The emergence of nosocomial infections by multidrug-resistant sequence type 117 (ST117) Enterococcus faecium has been reported in several European countries. ST117 has been detected in Spanish hospitals as one of the main causes of bloodstream infections. We analyzed genome variations of ST117 strains isolated in Madrid and describe the first ST117 closed genome sequences. Copyright © 2017 Tedim et al.


July 19, 2019

The sunflower genome provides insights into oil metabolism, flowering and Asterid evolution.

The domesticated sunflower, Helianthus annuus L., is a global oil crop that has promise for climate change adaptation, because it can maintain stable yields across a wide variety of environmental conditions, including drought. Even greater resilience is achievable through the mining of resistance alleles from compatible wild sunflower relatives, including numerous extremophile species. Here we report a high-quality reference for the sunflower genome (3.6 gigabases), together with extensive transcriptomic data from vegetative and floral organs. The genome mostly consists of highly similar, related sequences and required single-molecule real-time sequencing technologies for successful assembly. Genome analyses enabled the reconstruction of the evolutionary history of the Asterids, further establishing the existence of a whole-genome triplication at the base of the Asterids II clade and a sunflower-specific whole-genome duplication around 29 million years ago. An integrative approach combining quantitative genetics, expression and diversity data permitted development of comprehensive gene networks for two major breeding traits, flowering time and oil metabolism, and revealed new candidate genes in these networks. We found that the genomic architecture of flowering time has been shaped by the most recent whole-genome duplication, which suggests that ancient paralogues can remain in the same regulatory networks for dozens of millions of years. This genome represents a cornerstone for future research programs aiming to exploit genetic diversity to improve biotic and abiotic stress resistance and oil production, while also considering agricultural constraints and human nutritional needs.


July 19, 2019

Long-read sequencing uncovers the adaptive topography of a carnivorous plant genome.

Utricularia gibba, the humped bladderwort, is a carnivorous plant that retains a tiny nuclear genome despite at least two rounds of whole genome duplication (WGD) since common ancestry with grapevine and other species. We used a third-generation genome assembly with several complete chromosomes to reconstruct the two most recent lineage-specific ancestral genomes that led to the modern U. gibba genome structure. Patterns of subgenome dominance in the most recent WGD, both architectural and transcriptional, are suggestive of allopolyploidization, which may have generated genomic novelty and led to instantaneous speciation. Syntenic duplicates retained in polyploid blocks are enriched for transcription factor functions, whereas gene copies derived from ongoing tandem duplication events are enriched in metabolic functions potentially important for a carnivorous plant. Among these are tandem arrays of cysteine protease genes with trap-specific expression that evolved within a protein family known to be useful in the digestion of animal prey. Further enriched functions among tandem duplicates (also with trap-enhanced expression) include peptide transport (intercellular movement of broken-down prey proteins), ATPase activities (bladder-trap acidification and transmembrane nutrient transport), hydrolase and chitinase activities (breakdown of prey polysaccharides), and cell-wall dynamic components possibly associated with active bladder movements. Whereas independently polyploid Arabidopsis syntenic gene duplicates are similarly enriched for transcriptional regulatory activities, Arabidopsis tandems are distinct from those of U. gibba, while still metabolic and likely reflecting unique adaptations of that species. Taken together, these findings highlight the special importance of tandem duplications in the adaptive landscapes of a carnivorous plant genome.


July 19, 2019

SMRT genome assembly corrects reference errors, resolving the genetic basis of virulence in Mycobacterium tuberculosis.

The genetic basis of virulence in Mycobacterium tuberculosis has been investigated through genome comparisons of virulent (H37Rv) and attenuated (H37Ra) sister strains. Such analysis, however, relies heavily on the accuracy of the sequences. While the H37Rv reference genome has had several corrections to date, that of H37Ra is unmodified since its original publication. Here, we report the assembly and finishing of the H37Ra genome from single-molecule, real-time (SMRT) sequencing. Our assembly reveals that the number of H37Ra-specific variants is less than half of what the Sanger-based H37Ra reference sequence indicates, undermining and, in some cases, invalidating the conclusions of several studies. PE_PPE family genes, which are intractable to commonly used sequencing platforms because of their repetitive and GC-rich nature, are overrepresented in the set of genes in which all reported H37Ra-specific variants are contradicted. Further, one of the sequencing errors in H37Ra masks a true variant in common with the clinical strain CDC1551 which, when considered in the context of previous work, corresponds to a sequencing error in the H37Rv reference genome. Our results constrain the set of genomic differences possibly affecting virulence by more than half, which focuses laboratory investigation on pertinent targets and demonstrates the power of SMRT sequencing for producing high-quality reference genomes.


July 19, 2019

An integrated strategy combining DNA walking and NGS to detect GMOs.

Recently, we developed a DNA walking system for the detection and characterization of a broad spectrum of GMOs in routine analysis of food/feed matrices. Here, we present a new version with improved throughput and sensitivity by coupling the DNA walking system to Pacific Biosciences® next-generation sequencing technology. The performance of the new strategy was thoroughly assessed through several assays. First, we tested its detection and identification capability on grains with high or low GMO content. Second, the potential impacts of food processing were investigated using rice noodle samples. Finally, GMO mixtures and a real-life sample were analyzed to illustrate the applicability of the proposed strategy in routine GMO analysis. In all tested samples, the presence of multiple GMOs was unambiguously proven by the characterization of transgene flanking regions and the combinations of elements that are typical for transgene constructs. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.


July 19, 2019

Widespread adenine N6-methylation of active genes in fungi.

N6-methyldeoxyadenine (6mA) is a noncanonical DNA base modification present at low levels in plant and animal genomes, but its prevalence and association with genome function in other eukaryotic lineages remain poorly understood. Here we report that abundant 6mA is associated with transcriptionally active genes in early-diverging fungal lineages. Using single-molecule long-read sequencing of 16 diverse fungal genomes, we observed that up to 2.8% of all adenines were methylated in early-diverging fungi, far exceeding levels observed in other eukaryotes and more derived fungi. 6mA occurred symmetrically at ApT dinucleotides and was concentrated in dense methylated adenine clusters surrounding the transcriptional start sites of expressed genes; its distribution was inversely correlated with that of 5-methylcytosine. Our results show a striking contrast in the genomic distributions of 6mA and 5-methylcytosine and reinforce a distinct role for 6mA as a gene-expression-associated epigenomic mark in eukaryotes.


July 19, 2019

Complete genome sequence of Tessaracoccus sp. strain T2.5-30 isolated from 139.5 meters deep on the subsurface of the Iberian Pyritic Belt.

Here, we report the complete genome sequence of Tessaracoccus sp. strain T2.5-30, which consists of a chromosome with 3.2 Mbp, 70.4% G+C content, and 3,005 coding DNA sequences. The strain was isolated from a rock core retrieved at a depth of 139.5 m in the subsurface of the Iberian Pyritic Belt (Spain). Copyright © 2017 Leandro et al.


July 19, 2019

Dual redundant sequencing strategy: Full-length gene characterisation of 1056 novel and confirmatory HLA alleles.

The high-throughput department of DKMS Life Science Lab encounters novel human leukocyte antigen (HLA) alleles on a daily basis. To characterise these alleles, we have developed a system to sequence the whole gene from 5′- to 3′-UTR for the HLA loci A, B, C, DQB1 and DPB1 for submission to the European Molecular Biology Laboratory – European Nucleotide Archive (EMBL-ENA) and the IPD-IMGT/HLA Database. Our workflow is based on a dual redundant sequencing strategy. Using shotgun sequencing on an Illumina MiSeq instrument and single molecule real-time (SMRT) sequencing on a PacBio RS II instrument, we are able to achieve highly accurate HLA full-length consensus sequences. Remaining conflicts are resolved using the R package DR2S (Dual Redundant Reference Sequencing). Given the relatively high throughput of this strategy, we have developed the semi-automated web service TypeLoader to aid in the submission of sequences to the EMBL-ENA and the IPD-IMGT/HLA Database. In the IPD-IMGT/HLA Database release 3.24.0 (April 2016; prior to the submission of the sequences described here), only 5.2% of all known HLA alleles have been fully characterised together with intronic and UTR sequences. So far, we have applied our strategy to characterise and submit 1056 HLA alleles, thereby more than doubling the number of fully characterised alleles. Given the increasing application of next generation sequencing (NGS) for full gene characterisation in clinical practice, extending the HLA database concomitantly is highly desirable. Therefore, we propose this dual redundant sequencing strategy as a workflow for submission of novel full-length alleles and characterisation of sequences that are as yet incomplete. This would help to mitigate the predominance of partially known alleles in the database. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.


July 19, 2019

Quasispecies composition and evolution of a typical Zika virus clinical isolate from Suriname.

The arthropod-borne Zika virus (ZIKV) currently poses a major international public health threat in the Americas. This study describes the isolation of ZIKV from the plasma of a 29-year-old female traveler who developed typical symptoms, such as rash, fever and headache, upon return from Suriname. The complete genome sequence, including the 5′ and 3′ untranslated regions, was determined, and phylogenetic analysis showed the isolate clustering within the Asian lineage, close to other viruses that have recently been isolated in the Americas. In addition, the viral quasispecies composition was analyzed by single molecule real time sequencing, which suggested a mutation frequency of 1.4 × 10⁻⁴ for this ZIKV isolate. Continued passaging of the virus in cell culture led to the selection of variants with mutations in NS1 and the E protein. The latter might influence virus binding to cell surface heparan sulfate.


July 19, 2019

Complete genome sequence of Vibrio campbellii strain 20130629003S01 isolated from shrimp with acute hepatopancreatic necrosis disease.

Vibrio campbellii is widely distributed in the marine environment and is an important pathogen of aquatic organisms such as shrimp, fish, and mollusks. An isolate of V. campbellii carrying the pirAB(vp) gene, causing acute hepatopancreatic necrosis disease (AHPND), has been reported. There are no previous reports about the complete genome of V. campbellii causing AHPND (VCAHPND). To extend our understanding of the pathogenesis of VCAHPND at the genomic level, the genome of V. campbellii 20130629003S01 isolated from a shrimp with AHPND was sequenced and analysed. The complete genome sequence of V. campbellii 20130629003S01 was generated using the PacBio RSII platform with single molecule, real-time sequencing. The 20130629003S01 strain consists of two circular chromosomes (3,621,712 bp in chromosome 1 and 2,245,751 bp in chromosome 2) and four plasmids of 70,066, 204,531, 143,140, and 86,121 bp. The genome contains a total of 5855 protein coding genes, 134 tRNA genes and 37 rRNA genes. The average nucleotide identity value of 20130629003S01 and other reference V. campbellii strains was 97.46%, suggesting that they are closely related. The genome sequence of V. campbellii 20130629003S01 and its comparative analysis with other V. campbellii strains that we present here are important for a better understanding of the genomic characteristics of VCAHPND.


July 19, 2019

Comparative genomics of two sequential Candida glabrata clinical isolates.

Candida glabrata is an important fungal pathogen which develops rapid antifungal resistance in treated patients. It is known that azole treatments lead to antifungal resistance in this fungal species and that multidrug efflux transporters are involved in this process. Specific mutations in the transcriptional regulator PDR1 result in upregulation of the transporters. In addition, we showed that the PDR1 mutations can contribute to enhanced virulence in animal models. In this study, we sought to compare the genomes of two specific C. glabrata-related isolates, one of which was azole susceptible (DSY562) while the other was azole resistant (DSY565). DSY565 contained a PDR1 mutation (L280F) and was isolated after a time-lapse of 50 d of azole therapy. We expected that genome comparisons between both isolates could reveal additional mutations reflecting host adaptation or even additional resistance mechanisms. The PacBio technology used here yielded 14 major contigs (sizes 0.18-1.6 Mb) and mitochondrial genomes from both DSY562 and DSY565 isolates that were highly similar to each other. Comparisons of the clinical genomes with the published CBS138 genome indicated important genome rearrangements, but not between the clinical strains. Among the unique features, several retrotransposons were identified in the genomes of the investigated clinical isolates. DSY562 and DSY565 each contained a large set of adhesin-like genes (101 and 107, respectively), which exceed by far the number of reported adhesins (63) in the CBS138 genome. Comparison between DSY562 and DSY565 yielded 17 nonsynonymous SNPs (among which was the expected PDR1 mutation) as well as small size indels in coding regions (11) but mainly in adhesin-like genes. The genomes contained a DNA mismatch repair allele of MSH2 known to be involved in the so-called hyper-mutator phenotype of this yeast species, and the number of accumulated mutations between both clinical isolates is consistent with the presence of an MSH2 defect. In conclusion, this study is the first to compare genomes of C. glabrata sequential clinical isolates using the PacBio technology as an approach. The genomes of these isolates, taken from the same patient at two different time points, exhibited limited variations, even though subjected to host pressure. Copyright © 2017 Vale-Silva et al.

