Menu
July 7, 2019

Comparative genomics of Beauveria bassiana: uncovering signatures of virulence against mosquitoes.

Entomopathogenic fungi such as Beauveria bassiana are promising biological agents for control of malaria mosquitoes. Indeed, infection with B. bassiana reduces the lifespan of mosquitoes in the laboratory and in the field. Natural isolates of B. bassiana show up to 10-fold differences in virulence between the most and the least virulent isolate. In this study, we sequenced the genomes of five isolates representing the extremes of low/high virulence and three RNA libraries, and applied a genome comparison approach to uncover genetic mechanisms underpinning virulence.A high-quality, near-complete genome assembly was achieved for the highly virulent isolate Bb8028, which was compared to the assemblies of the four other isolates. Whole genome analysis showed a high level of genetic diversity between the five isolates (2.85-16.8 SNPs/kb), which grouped into two distinct phylogenetic clusters. Mating type gene analysis revealed the presence of either the MAT1-1-1 or the MAT1-2-1 gene. Moreover, a putative new MAT gene (MAT1-2-8) was detected in the MAT1-2 locus. Comparative genome analysis revealed that Bb8028 contains 163 genes exclusive for this isolate. These unique genes have a tendency to cluster in the genome and to be often located near the telomeres. Among the genes unique to Bb8028 are a Non-Ribosomal Peptide Synthetase (NRPS) secondary metabolite gene cluster, a polyketide synthase (PKS) gene, and five genes with homology to bacterial toxins. A survey of candidate virulence genes for B. bassiana is presented.Our results indicate several genes and molecular processes that may underpin virulence towards mosquitoes. Thus, the genome sequences of five isolates of B. bassiana provide a better understanding of the natural variation in virulence and will offer a major resource for future research on this important biological control agent.


July 7, 2019

Genome sequence of a commensal bacterium, Enterococcus faecalis CBA7120, isolated from a Korean fecal sample.

Enterococcus faecalis, the type strain of the genus Enterococcus, is not only a commensal bacterium in the gastrointestinal tract in vertebrates and invertebrates, but also causes serious disease as an opportunistic pathogen. To date, genome sequences have been published for over four hundred E. faecalis strains; however, pathogenicity of these microbes remains complicated. To increase our knowledge of E. faecalis virulence factors, we isolated strain CBA7120 from the feces of an 81-year-old female from the Republic of Korea and performed a comparative genomic analysis.The genome sequence of E. faecalis CBA7120 is 3,134,087 bp in length, with a G + C content of 37.35 mol%, and is comprised of four contigs with an N50 value of 2,922,046 bp. The genome showed high similarity with other strains of E. faecalis, including OG1RF, T13, 12107 and T20, based on OrthoANI values. Strain CBA7120 contains 374 pan-genome orthologous groups (POGs) as singletons, including “Phages, Prophages, Transposable elements, Plasmids,” “Carbohydrates,” “DNA metabolism,” and “Virulence, Disease and Defense” subsystems. Genes related to multidrug resistance efflux pumps were annotated in the genome.The comparative genomic analysis of E. faecalis strains presented in this study was performed using a variety of analysis methods and will facilitate future identification of hypothetical proteins.


July 7, 2019

Use of single molecule sequencing for comparative genomics of an environmental and a clinical isolate of Clostridium difficile ribotype 078.

How the pathogen Clostridium difficile might survive, evolve and be transferred between reservoirs within the natural environment is poorly understood. Some ribotypes are found both in clinical and environmental settings. Whether these strains are distinct from each another and evolve in the specific environments is not established. The possession of a highly mobile genome has contributed to the genetic diversity and ongoing evolution of C. difficile. Interpretations of genetic diversity have been limited by fragmented assemblies resulting from short-read length sequencing approaches and by a limited understanding of epigenetic regulation of diversity. To address this, single molecule real time (SMRT) sequencing was used in this study as it produces high quality genome sequences, with resolution of repeat regions (including those found in mobile elements) and can generate data to determine methylation modifications across the sequence (the methylome).Chromosomal rearrangements and ribosomal operon duplications were observed in both genomes. The rearrangements occurred at insertion sites within two mobile genetic elements (MGEs), Tn6164 and Tn6293, present only in the M120 and CD105HS27 genomes, respectively. The gene content of these two transposons differ considerably which could impact upon horizontal gene transfer; differences include CDSs encoding methylases and a conjugative prophage only in Tn6164. To investigate mechanisms which could affect MGE transfer, the methylome, restriction modification (RM)  and the CRISPR/Cas systems were characterised for each strain. Notably, the environmental isolate, CD105HS27, does not share a consensus motif for (m4)C methylation, but has one additional spacer  when compared to the clinical isolate M120.These findings show key differences between the two strains in terms of their genetic capacity for MGE transfer. The carriage of horizontally transferred genes appear to have genome wide effects based on two different methylation patterns. The CRISPR/Cas system appears active although perhaps slow to evolve. Data suggests that both mechanisms are functional and impact upon horizontal gene transfer and genome evolution within C. difficile.


July 7, 2019

The draft genome of whitefly Bemisia tabaci MEAM1, a global crop pest, provides novel insights into virus transmission, host adaptation, and insecticide resistance.

The whitefly Bemisia tabaci (Hemiptera: Aleyrodidae) is among the 100 worst invasive species in the world. As one of the most important crop pests and virus vectors, B. tabaci causes substantial crop losses and poses a serious threat to global food security. We report the 615-Mb high-quality genome sequence of B. tabaci Middle East-Asia Minor 1 (MEAM1), the first genome sequence in the Aleyrodidae family, which contains 15,664 protein-coding genes. The B. tabaci genome is highly divergent from other sequenced hemipteran genomes, sharing no detectable synteny. A number of known detoxification gene families, including cytochrome P450s and UDP-glucuronosyltransferases, are significantly expanded in B. tabaci. Other expanded gene families, including cathepsins, large clusters of tandemly duplicated B. tabaci-specific genes, and phosphatidylethanolamine-binding proteins (PEBPs), were found to be associated with virus acquisition and transmission and/or insecticide resistance, likely contributing to the global invasiveness and efficient virus transmission capacity of B. tabaci. The presence of 142 horizontally transferred genes from bacteria or fungi in the B. tabaci genome, including genes encoding hopanoid/sterol synthesis and xenobiotic detoxification enzymes that are not present in other insects, offers novel insights into the unique biological adaptations of this insect such as polyphagy and insecticide resistance. Interestingly, two adjacent bacterial pantothenate biosynthesis genes, panB and panC, have been co-transferred into B. tabaci and fused into a single gene that has acquired introns during its evolution.The B. tabaci genome contains numerous genetic novelties, including expansions in gene families associated with insecticide resistance, detoxification and virus transmission, as well as numerous horizontally transferred genes from bacteria and fungi. We believe these novelties likely have shaped B. tabaci as a highly invasive polyphagous crop pest and efficient vector of plant viruses. The genome serves as a reference for resolving the B. tabaci cryptic species complex, understanding fundamental biological novelties, and providing valuable genetic information to assist the development of novel strategies for controlling whiteflies and the viruses they transmit.


July 7, 2019

Complete genome sequence of Rothia aeria type strain JCM 11412, isolated from air in the Russian space laboratory Mir.

Here, we present the complete genome sequence of Rothia aeria type strain JCM 11412, isolated from air in the Russian space laboratory Mir. Recently, there has been an increasing number of reports on infections caused by R. aeria The genomic information will enable researchers to identify the pathogenicity of this organism. Copyright © 2016 Nambu et al.


July 7, 2019

Complete genome sequence of Streptococcus sp. strain NPS 308.

Streptococcus sp. strain NPS 308, isolated from an 8-year-old girl diagnosed with infective endocarditis, likely presents a novel species of Streptococcus Here, we present a complete genome sequence of this species, which will contribute to better understanding of the pathogenesis of infective endocarditis. Copyright © 2016 Kondo et al.


July 7, 2019

Whole genome sequence and comparative genomics of the novel Lyme borreliosis causing pathogen, Borrelia mayonii.

Borrelia mayonii, a Borrelia burgdorferi sensu lato (Bbsl) genospecies, was recently identified as a cause of Lyme borreliosis (LB) among patients from the upper midwestern United States. By microscopy and PCR, spirochete/genome loads in infected patients were estimated at 105 to 106 per milliliter of blood. Here, we present the full chromosome and plasmid sequences of two B. mayonii isolates, MN14-1420 and MN14-1539, cultured from blood of two of these patients. Whole genome sequencing and assembly was conducted using PacBio long read sequencing (Pacific Biosciences RSII instrument) followed by hierarchical genome-assembly process (HGAP). The B. mayonii genome is ~1.31 Mbp in size (26.9% average GC content) and is comprised of a linear chromosome, 8 linear and 7 circular plasmids. Consistent with its taxonomic designation as a new Bbsl genospecies, the B. mayonii linear chromosome shares only 93.83% average nucleotide identity with other genospecies. Both B. mayonii genomes contain plasmids similar to B. burgdorferi sensu stricto lp54, lp36, lp28-3, lp28-4, lp25, lp17, lp5, 5 cp32s, cp26, and cp9. The vls locus present on lp28-10 of B. mayonii MN14-1420 is remarkably long, being comprised of 24 silent vls cassettes. Genetic differences between the two B. mayonii genomes are limited and include 15 single nucleotide variations as well as 7 fewer silent vls cassettes and a lack of the lp5 plasmid in MN14-1539. Notably, 68 homologs to proteins present in B. burgdorferi sensu stricto appear to be lacking from the B. mayonii genomes. These include the complement inhibitor, CspZ (BB_H06), the fibronectin binding protein, BB_K32, as well as multiple lipoproteins and proteins of unknown function. This study shows the utility of long read sequencing for full genome assembly of Bbsl genomes, identifies putative genome regions of B. mayonii that may be linked to clinical manifestation or tissue tropism, and provides a valuable resource for pathogenicity, diagnostic and vaccine studies.


July 7, 2019

Genome sequence and comparative pathogenic determinants of multidrug resistant uropathogenic Escherichia coli O25b: H4, A clinical isolate from Saudi Arabia

Escherichia coli serotype O25b:H4 is involved in human urinary tract infections.In this study, we sequenced and analyzed E. coli O25b:H4 isolated from a patient sufferingfrom recurring UTI infections in an intensive care unit at Hera General Hospital inMakkah, Saudi Arabia. We aimed to determine the virulence genes for pathogenesis anddrug resistance of this isolate compared to other E. coli strains. We sequenced and analyzedthe E. coli O25b:H4 Saudi strain clinical isolate using next generation sequencing. Usingthe ERGO genome analysis platform, we performed annotations and identified virulenceand antibiotic resistance determinants of this clinical isolate. The E. coli O25b:H4 genomewas assembled into four contigs representing a total chromosome size of 5.28 Mb, andthree contigs were identified, including a 130.9 kb (virulence plasmid) contig bearing thebla-CTX gene and 32 kb and 29 kb contigs. In comparing this genome to otheruropathogenic E. coli genomes, we identified unique drug resistance and pathogenicityfactors. In this work, whole-genome sequencing and targeted comparative analysis of aclinical isolate of uropathogenic Escherichia coli O25b:H4 was performed. This strainencodes virulence genes linked with extraintestinal pathogenic E. coli (ExPEC) that areexpressed constitutively in E. coli ST131. We identified the genes responsible forpathogenesis and drug resistance and performed comparative analyses of the virulenceand antibiotic resistance determinants with those of other E. coli UPEC isolates. This isthe first report of genome sequencing and analysis of a UPEC strain from Saudi Arabia.


July 7, 2019

Colib’read on galaxy: a tools suite dedicated to biological information extraction from raw NGS reads

With next-generation sequencing (NGS) technologies, the life sciences face a deluge of raw data. Classical analysis processes for such data often begin with an assembly step, needing large amounts of computing resources, and potentially removing or modifying parts of the biological information contained in the data. Our approach proposes to focus directly on biological questions, by considering raw unassembled NGS data, through a suite of six command-line tools.


July 7, 2019

STR-realigner: a realignment method for short tandem repeat regions.

In the estimation of repeat numbers in a short tandem repeat (STR) region from high-throughput sequencing data, two types of strategies are mainly taken: a strategy based on counting repeat patterns included in sequence reads spanning the region and a strategy based on estimating the difference between the actual insert size and the insert size inferred from paired-end reads. The quality of sequence alignment is crucial, especially in the former approaches although usual alignment methods have difficulty in STR regions due to insertions and deletions caused by the variations of repeat numbers.We proposed a new dynamic programming based realignment method named STR-realigner that considers repeat patterns in STR regions as prior knowledge. By allowing the size change of repeat patterns with low penalty in STR regions, accurate realignment is expected. For the performance evaluation, publicly available STR variant calling tools were applied to three types of aligned reads: synthetically generated sequencing reads aligned with BWA-MEM, those realigned with STR-realigner, those realigned with ReviSTER, and those realigned with GATK IndelRealigner. From the comparison of root mean squared errors between estimated and true STR region size, the results for the dataset realigned with STR-realigner are better than those for other cases. For real data analysis, we used a real sequencing dataset from Illumina HiSeq 2000 for a parent-offspring trio. RepeatSeq and lobSTR were applied to the sequence reads for these individuals aligned with BWA-MEM, those realigned with STR-realigner, ReviSTER, and GATK IndelRealigner. STR-realigner shows the best performance in terms of consistency of the size of estimated STR regions in Mendelian inheritance. Root mean squared error values were also calculated from the comparison of these estimated results with STR region sizes obtained from high coverage PacBio sequencing data, and the results from the realigned sequencing data with STR-realigner showed the least (the best) root mean squared error value.The effectiveness of the proposed realignment method for STR regions was verified from the comparison with an existing method on both simulation datasets and real whole genome sequencing dataset.


July 7, 2019

First complete genome sequence of a subdivision 6 Acidobacterium strain.

Although ubiquitous and abundant in soils, acidobacteria have mostly escaped isolation and remain poorly investigated. Only a few cultured representatives and just eight genomes of subdivisions 1, 3, and 4 are available to date. Here, we determined the complete genome sequence of strain HEG_-6_39, the first genome of Acidobacterium subdivision 6. Copyright © 2016 Huang et al.


July 7, 2019

Improved assembly of noisy long reads by k-mer validation.

Genome assembly depends critically on read length. Two recent technologies, from Pacific Biosciences (PacBio) and Oxford Nanopore, produce read lengths >20 kb, which yield de novo genome assemblies with vastly greater contiguity than those based on Sanger, Illumina, or other technologies. However, the very high error rates of these two new technologies (~15% per base) makes assembly imprecise at repeats longer than the read length and computationally expensive. Here we show that the contiguity and quality of the assembly of these noisy long reads can be significantly improved at a minimal cost, by leveraging on the low error rate and low cost of Illumina short reads. Namely, k-mers from the PacBio raw reads that are not present in Illumina reads (which account for ~95% of the distinct k-mers) are deemed sequencing errors and ignored at the seed alignment step. By focusing on the ~5% of k-mers that are error free, read overlap sensitivity is dramatically increased. Of equal importance, the validation procedure can be extended to exclude repetitive k-mers, which prevents read miscorrection at repeats and further improves the resulting assemblies. We tested the k-mer validation procedure using one long-read technology (PacBio) and one assembler (MHAP/Celera Assembler), but it is very likely to yield analogous improvements with alternative long-read technologies and assemblers, such as Oxford Nanopore and BLASR/DALIGNER/Falcon, respectively.© 2016 Carvalho et al.; Published by Cold Spring Harbor Laboratory Press.


July 7, 2019

Jabba: hybrid error correction for long sequencing reads.

Third generation sequencing platforms produce longer reads with higher error rates than second generation technologies. While the improved read length can provide useful information for downstream analysis, underlying algorithms are challenged by the high error rate. Error correction methods in which accurate short reads are used to correct noisy long reads appear to be attractive to generate high-quality long reads. Methods that align short reads to long reads do not optimally use the information contained in the second generation data, and suffer from large runtimes. Recently, a new hybrid error correcting method has been proposed, where the second generation data is first assembled into a de Bruijn graph, on which the long reads are then aligned.In this context we present Jabba, a hybrid method to correct long third generation reads by mapping them on a corrected de Bruijn graph that was constructed from second generation data. Unique to our method is the use of a pseudo alignment approach with a seed-and-extend methodology, using maximal exact matches (MEMs) as seeds. In addition to benchmark results, certain theoretical results concerning the possibilities and limitations of the use of MEMs in the context of third generation reads are presented.Jabba produces highly reliable corrected reads: almost all corrected reads align to the reference, and these alignments have a very high identity. Many of the aligned reads are error-free. Additionally, Jabba corrects reads using a very low amount of CPU time. From this we conclude that pseudo alignment with MEMs is a fast and reliable method to map long highly erroneous sequences on a de Bruijn graph.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.