September 22, 2019

Progressive approach for SNP calling and haplotype assembly using single molecular sequencing data.

Haplotype information is essential to the complete description and interpretation of genomes, genetic diversity and genetic ancestry. New technologies can provide Single Molecular Sequencing (SMS) data that cover about 90% of positions over chromosomes. However, SMS data have a higher error rate than short reads, whose error rate is about 1%. SNP calling and haplotype assembly using SMS reads are therefore very difficult, and most existing technologies do not work properly for SMS data. In this paper, we develop a progressive approach for SNP calling and haplotype assembly that works very well for SMS data. Our method can handle more than 200 million non-N bases on Chromosome 1 with millions of reads and more than 100 blocks, each of which contains more than 2 million bases and more than 3K SNP sites on average. Experimental results show that the false discovery rate and false negative rate for our method are 15.7% and 11.0% on NA12878, and 16.5% and 11.0% on NA24385. Moreover, the overall switch errors for our method are 7.26 and 5.21 with an average of 3378 and 5736 SNP sites per block on NA12878 and NA24385, respectively. Here, we demonstrate that SMS reads alone can generate a high-quality solution for both SNP calling and haplotype assembly. Source code and results are available at https://github.com/guofeieileen/SMRT/wiki/Software.
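The abstract reports a false discovery rate and a false negative rate but does not spell out how they were computed. As a minimal sketch, assuming the standard definitions (FDR = FP / (TP + FP), FNR = FN / (TP + FN)) and assuming SNP calls are compared with a truth set by position only, the two rates could be derived as follows. The function name and the position-only matching are illustrative assumptions, not the authors' evaluation code.

```python
def snp_calling_error_rates(called_positions, truth_positions):
    """Return (false discovery rate, false negative rate) for a SNP call set.

    Hypothetical helper: calls and truth are matched by position only,
    which is a simplifying assumption (alleles are not compared).
    """
    called = set(called_positions)
    truth = set(truth_positions)

    true_pos = len(called & truth)    # called and present in the truth set
    false_pos = len(called - truth)   # called but absent from the truth set
    false_neg = len(truth - called)   # in the truth set but not called

    # Guard against empty inputs so the function never divides by zero.
    fdr = false_pos / (true_pos + false_pos) if called else 0.0
    fnr = false_neg / (true_pos + false_neg) if truth else 0.0
    return fdr, fnr


if __name__ == "__main__":
    # Toy positions, not data from the paper.
    fdr, fnr = snp_calling_error_rates({1, 2, 3, 4}, {2, 3, 4, 5, 6})
    print(f"FDR={fdr:.2%}  FNR={fnr:.2%}")
```

A real evaluation would normally also compare alleles and restrict the comparison to high-confidence regions of the benchmark genome.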


September 22, 2019

Genome-based evolutionary history of Pseudomonas spp.

Pseudomonas is a large and diverse genus of Gammaproteobacteria. To provide a framework for discovery of evolutionary and taxonomic relationships of these bacteria, we compared the genomes of type strains of 163 species and 3 additional subspecies of Pseudomonas, including 118 genomes sequenced herein. A maximum likelihood phylogeny of the 166 type strains based on protein sequences of 100 single-copy orthologous genes revealed thirteen groups of Pseudomonas, composed of two to sixty-three species each. Pairwise average nucleotide identities and alignment fractions were calculated for the data set of the 166 type strains and 1224 genomes of Pseudomonas available in public databases. Results revealed that 394 of the 1224 genomes were distinct from any type strain, suggesting that the type strains represent only a fraction of the genomic diversity of the genus. The core genome of Pseudomonas was determined to contain 794 genes conferring primarily housekeeping functions. The results of this study provide a phylogenetic framework for future studies aiming to resolve the classification and phylogenetic relationships, identify new gene functions and phenotypes, and explore the ecological and metabolic potential of Pseudomonas spp. © 2018 Society for Applied Microbiology and John Wiley & Sons Ltd.


September 22, 2019

Pm21 from Haynaldia villosa encodes a CC-NBS-LRR protein conferring powdery mildew resistance in wheat.

Wheat powdery mildew, caused by Blumeria graminis f. sp. tritici (Bgt), is a destructive disease of wheat throughout the world. One of the most important environmentally friendly and economical methods to reduce wheat loss caused by Bgt is to develop highly resistant varieties (Kuraparthy et al., 2007). Pm21 from the wild species Haynaldia villosa (also known as Dasypyrum villosum) confers high resistance to Bgt in wheat throughout all growth stages. It has now become one of the most effective genetic loci introgressed into wheat from wild species, and commercial varieties harboring Pm21 have been widely used in wheat production, covering more than 4 million hectares in China.


September 22, 2019

Sequence analysis of IncA/C and IncI1 plasmids isolated from multidrug-resistant Salmonella Newport using Single-Molecule Real-Time Sequencing.

Multidrug-resistant (MDR) plasmids play an important role in disseminating antimicrobial resistance genes. To elucidate the antimicrobial resistance gene compositions in A/C incompatibility complex (IncA/C) plasmids carried by animal-derived MDR Salmonella Newport, and to investigate the spread mechanism of IncA/C plasmids, this study characterizes the complete nucleotide sequences of IncA/C plasmids by comparative analysis. Complete nucleotide sequencing of plasmids and chromosomes of six MDR Salmonella Newport strains was performed using PacBio RSII. Open reading frames were assigned using the prokaryotic genome annotation pipeline (PGAP). To understand genomic diversity and evolutionary relationships among Salmonella Newport IncA/C plasmids, we included three complete IncA/C plasmid sequences with similar backbones from Salmonella Newport and Escherichia coli (pSN254, pAM04528, and peH4H) and an additional 200 draft chromosomes. With the exception of canine isolate CVM22462, which contained an additional IncI1 plasmid, each of the six MDR Salmonella Newport strains contained only the IncA/C plasmid. These IncA/C plasmids (including references) ranged in size from 80.1 kb (pCVM21538) to 176.5 kb (pSN254) and carried various resistance genes. Resistance genes floR, tetA, tetR, strA, strB, sul, and mer were identified in all IncA/C plasmids. Additionally, blaCMY-2 and sugE were present in all IncA/C plasmids except pCVM21538. Plasmid pCVM22462 was capable of being transferred by conjugation. The IncI1 plasmid pCVM22462b in CVM22462 carried blaCMY-2 and sugE. Our data showed that MDR Salmonella Newport strains carrying similar IncA/C plasmids clustered together in the phylogenetic tree built from chromosome sequences, and that the IncA/C plasmids from animal-derived Salmonella Newport contained diverse resistance genes.
In the current study, we analyzed genomic diversity and phylogenetic relationships among MDR Salmonella Newport using complete plasmid and chromosome sequences and proposed a possible spread mechanism of IncA/C plasmids in Salmonella Newport Lineage II.


September 22, 2019

Catabolism of 2-hydroxypyridine by Burkholderia sp. MAK1: a five-gene cluster encoded 2-hydroxypyridine 5-monooxygenase HpdABCDE catalyses the first step of biodegradation.

Microbial degradation of 2-hydroxypyridine usually results in the formation of a blue pigment (nicotine blue). In contrast, the Burkholderia sp. strain MAK1 bacterium utilizes 2-hydroxypyridine without the accumulation of nicotine blue. This scarcely investigated degradation pathway presumably employs 2-hydroxypyridine 5-monooxygenase, an elusive enzyme that has been hypothesized but has yet to be identified or characterized. The isolation of the mutant strain Burkholderia sp. MAK1 ΔP5 that is unable to utilize 2-hydroxypyridine has led to the identification of a gene cluster (designated hpd) which is responsible for the degradation of 2-hydroxypyridine. The activity of 2-hydroxypyridine 5-monooxygenase has been assigned to a soluble diiron monooxygenase (SDIMO) encoded by a five-gene cluster (hpdA, hpdB, hpdC, hpdD, and hpdE). A 4.5-kb DNA fragment containing all five genes has been successfully expressed in Burkholderia sp. MAK1 ΔP5 cells. We have proved that the recombinant HpdABCDE protein catalyzes the enzymatic turnover of 2-hydroxypyridine to 2,5-dihydroxypyridine. Moreover, we have confirmed that emerging 2,5-dihydroxypyridine is a substrate for HpdF, an enzyme similar to 2,5-dihydroxypyridine 5,6-dioxygenases that are involved in the catabolic pathways of nicotine and nicotinic acid. The proteins and genes identified in this study have allowed the identification of a novel degradation pathway of 2-hydroxypyridine. Our results provide a better understanding of the biodegradation of pyridine derivatives in nature. Also, the discovered 2-hydroxypyridine 5-monooxygenase may be an attractive catalyst for the regioselective synthesis of various N-heterocyclic compounds.
IMPORTANCE The degradation pathway of 2-hydroxypyridine without the accumulation of a blue pigment is relatively unexplored, as, to our knowledge, no genetic data related to this process have ever been presented. In this paper, we describe genes and enzymes involved in this little-studied catabolic pathway. This work provides new insights into the metabolism of 2-hydroxypyridine in nature. A broad-range substrate specificity of 2-hydroxypyridine 5-monooxygenase, a key enzyme in the degradation, makes this biocatalyst attractive for the regioselective hydroxylation of pyridine derivatives. Copyright © 2018 American Society for Microbiology.


September 22, 2019

Genetic diversity of Cryptosporidium hominis in a Bangladeshi community as revealed by whole genome sequencing.

We studied the genetic diversity of Cryptosporidium hominis infections in slum-dwelling infants from Dhaka over a 2-year period. Cryptosporidium hominis infections were common during the monsoon, and were genetically diverse as measured by gp60 genotyping and whole-genome resequencing. Recombination in the parasite was evidenced by the decay of linkage disequilibrium in the genome over <300 bp. Regions of the genome with high levels of polymorphism were also identified. Yet to be determined is if genomic diversity is responsible in part for the high rate of reinfection, seasonality, and varied clinical presentations of cryptosporidiosis in this population.


September 22, 2019

Mycobacterial biomaterials and resources for researchers.

There are many resources available to mycobacterial researchers, including culture collections around the world that distribute biomaterials to the general scientific community, genomic and clinical databases, and powerful bioinformatics tools. However, many of these resources may be unknown to the research community. This review article aims to summarize and publicize many of these resources, thus strengthening the quality and reproducibility of mycobacterial research by providing the scientific community access to authenticated and quality-controlled biomaterials and a wealth of information, analytical tools and research opportunities.


September 22, 2019

Unexpected invasion of miniature inverted-repeat transposable elements in viral genomes

Transposable elements (TEs) are common and often present with high copy numbers in cellular genomes. Unlike in cellular organisms, TEs were previously thought to be either rare or absent in viruses. Almost all reported TEs display only one or two copies per viral genome. In addition, the discovery of pandoraviruses with genomes up to 2.5-Mb emphasizes the need for biologists to rethink the fundamental nature of the relationship between viruses and cellular life.


September 22, 2019

Parallels between experimental and natural evolution of legume symbionts.

The emergence of symbiotic interactions has been studied using population genomics in nature and experimental evolution in the laboratory, but the parallels between these processes remain unknown. Here we compare the emergence of rhizobia after the horizontal transfer of a symbiotic plasmid in natural populations of Cupriavidus taiwanensis, over 10 MY ago, with the experimental evolution of symbiotic Ralstonia solanacearum for a few hundred generations. In spite of major differences in terms of time span, environment, genetic background, and phenotypic achievement, both processes resulted in rapid genetic diversification dominated by purifying selection. We observe no adaptation in the plasmid carrying the genes responsible for the ecological transition. Instead, adaptation was associated with positive selection in a set of genes that led to the co-option of the same quorum-sensing system in both processes. Our results provide evidence for similarities in experimental and natural evolutionary transitions and highlight the potential of comparisons between both processes to understand symbiogenesis.


September 22, 2019

A reference genome of the European beech (Fagus sylvatica L.).

The European beech is arguably the most important climax broad-leaved tree species in Central Europe, widely planted for its valuable wood. Here, we report the 542 Mb draft genome sequence of an up to 300-year-old individual (Bhaga) from an undisturbed stand in the Kellerwald-Edersee National Park in central Germany. Using a hybrid assembly approach that combined Illumina reads from short- and long-insert libraries with long Pacific Biosciences reads, we obtained an assembled genome size of 542 Mb, in line with flow cytometric genome size estimation. The largest scaffold was 1.15 Mb, the N50 length was 145 kb, and the L50 count was 983. The assembly contained 0.12% Ns. A Benchmarking with Universal Single-Copy Orthologs (BUSCO) analysis retrieved 94% complete BUSCO genes, well in the range of other high-quality draft genomes of trees. A total of 62,012 protein-coding genes were predicted, assisted by transcriptome sequencing. In addition, we report an efficient method for extracting high-molecular-weight DNA from dormant buds, by which contamination by environmental bacteria and fungi was kept to a minimum. The assembled genome will be a valuable resource and reference for future population genomics studies on the evolution and past climate change adaptation of beech and will be helpful for identifying genes, e.g., those involved in drought tolerance, in order to select and breed individuals to adapt forestry to climate change in Europe. A continuously updated genome browser and download page can be accessed from beechgenome.net, which will include future genome versions of the reference individual Bhaga as new sequencing approaches develop.
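The N50 length and L50 count quoted above are standard assembly contiguity statistics. As a minimal sketch, assuming the usual definitions (N50 is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly, and L50 is the number of scaffolds in that set), they can be computed from a list of scaffold lengths. The function name and the toy lengths are illustrative and are not taken from the beech assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of scaffold or contig lengths."""
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first sequence that pushes the running sum to half of the
        # total assembly size defines both statistics.
        if running >= half_total:
            return length, count
    raise ValueError("at least one length is required")


if __name__ == "__main__":
    # Toy scaffold lengths; total is 54, so half is 27.
    toy = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(n50_l50(toy))  # 10 + 9 + 8 = 27 reaches the halfway point
```

Some tools compute these values against an estimated genome size rather than the assembly size (often called NG50), so reported figures are only comparable when the same convention is used.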


September 22, 2019

Genome of an allotetraploid wild peanut Arachis monticola: a de novo assembly.

Arachis monticola (2n = 4x = 40) is the only allotetraploid wild peanut within the Arachis genus and section, with an AABB-type genome of ~2.7 Gb in size. The AA-type subgenome is derived from diploid wild peanut Arachis duranensis, and the BB-type subgenome is derived from diploid wild peanut Arachis ipaensis. A. monticola is regarded either as the direct progenitor of the cultivated peanut or as an introgressive derivative between the cultivated peanut and wild species. The large polyploid genome structure and the extensive, nearly identical regions of the genome make the assembly of chromosomal pseudomolecules very challenging. Here we report the first reference-quality assembly of the A. monticola genome, using a series of advanced technologies. The final whole genome of A. monticola is ~2.62 Gb and has a contig N50 and scaffold N50 of 106.66 Kb and 124.92 Mb, respectively. The vast majority (91.83%) of the assembled sequence was anchored onto the 20 pseudo-chromosomes, and 96.07% of assemblies were accurately separated into AA- and BB-subgenomes. We demonstrated the efficiency of the current strategy for de novo assembly of the highly complex allotetraploid species, wild peanut (A. monticola), based on whole-genome shotgun sequencing, single molecule real-time sequencing, high-throughput chromosome conformation capture technology, and BioNano optical genome maps. These combined technologies produced a reference-quality genome of the allotetraploid wild peanut, which is valuable for understanding peanut domestication and evolution within the Arachis genus and among legume crops.


September 22, 2019

Comparative genomics of Campylobacter concisus: Analysis of clinical strains reveals genome diversity and pathogenic potential.

In recent years, an increasing number of Campylobacter species have been associated with human gastrointestinal (GI) diseases including gastroenteritis, inflammatory bowel disease, and colorectal cancer. Campylobacter concisus, an oral commensal historically linked to gingivitis and periodontitis, has been increasingly detected in the lower GI tract. In the present study, we generated robust genome sequence data from C. concisus strains and undertook a comprehensive pangenome assessment to identify C. concisus virulence properties and to explain potential adaptations acquired while residing in specific ecological niche(s) of the GI tract. Genomes of 53 new C. concisus strains were sequenced, assembled, and annotated, including 36 strains from gastroenteritis patients, 13 strains from Crohn’s disease patients and four strains from colitis patients (three collagenous colitis and one lymphocytic colitis). When compared with previously published sequences, strains clustered into two main groups/genomospecies (GS), with phylogenetic clustering explained neither by disease phenotype nor sample location. Paired oral/faecal isolates from the same patient indicated that there are few genetic differences between oral and gut isolates, which suggests that gut isolates most likely reflect oral strain relocation. Type IV and VI secretion system genes, known to be important for pathogenicity in the Campylobacter genus, were present in the genome assemblies, with 82% containing Type VI secretion system genes. Our findings indicate that C. concisus strains are genetically diverse, and the variability in bacterial secretion system content may play an important role in their virulence potential.


September 22, 2019

Directed evolution of multiple genomic loci allows the prediction of antibiotic resistance.

Antibiotic development is frequently plagued by the rapid emergence of drug resistance. However, assessing the risk of resistance development in the preclinical stage is difficult. Standard laboratory evolution approaches explore only a small fraction of the sequence space and fail to identify exceedingly rare resistance mutations and combinations thereof. Therefore, new rapid and exhaustive methods are needed to accurately assess the potential of resistance evolution and uncover the underlying mutational mechanisms. Here, we introduce directed evolution with random genomic mutations (DIvERGE), a method that allows an up to million-fold increase in mutation rate along the full lengths of multiple predefined loci in a range of bacterial species. In a single day, DIvERGE generated specific mutation combinations, yielding clinically significant resistance against trimethoprim and ciprofloxacin. Many of these mutations have remained previously undetected or provide resistance in a species-specific manner. These results indicate pathogen-specific resistance mechanisms and the necessity of future narrow-spectrum antibacterial treatments. In contrast to prior claims, we detected the rapid emergence of resistance against gepotidacin, a novel antibiotic currently in clinical trials. Based on these properties, DIvERGE could be applicable to identify less resistance-prone antibiotics at an early stage of drug development. Finally, we discuss potential future applications of DIvERGE in synthetic and evolutionary biology. Copyright © 2018 the Author(s). Published by PNAS.


September 22, 2019

The complete genome sequence of Vibrio aestuarianus W-40 reveals virulence factor genes.

Vibrio aestuarianus is an opportunistic environmental pathogen that has been associated with epidemics in cultured shrimp Penaeus vannamei. Hepatopancreas microsporidian (HPM) and monodon slow growth syndrome (MSGS) have been reported in cultured P. vannamei. In this study, we sequenced and assembled the whole genome of V. aestuarianus strain W-40, a strain that was originally isolated from the intestines of an infected P. vannamei. The genome of V. aestuarianus strain W-40 contains two circular chromosomes of 4,837,307 bp with a 46.23% GC content. We identified 4,457 open reading frames (ORFs) that occupy 86.35% of the genome. Vibrio aestuarianus strain W-40 consists primarily of the ATP-binding cassette (ABC) transporter system and the phosphotransferase system (PTS). CagA is a metabolism system that includes bacterial extracellular solute-binding protein. Glutathione reductase can purge superoxide radicals (O₂²⁻) and hydrogen peroxide (H₂O₂) damage in V. aestuarianus strain W-40. The presence of two complete type I restriction-modification systems was confirmed. A total of 42 insertion sequence (IS) elements and 16 IS elements were identified. Our results revealed a host of virulence factors that likely contribute to the pathogenicity of V. aestuarianus strain W-40, including the virulence factor genes vacA, clpC, and bvgA, which are important for biofilm dispersion. Several bacitracin and tetracycline antibiotic resistance-encoding genes and type VI secretion systems were also identified in the genome. The complete genome sequence will aid future studies of the pathogenesis of V. aestuarianus strain W-40 and allow new strategies to control disease to be developed. © 2018 The Authors. MicrobiologyOpen published by John Wiley & Sons Ltd.
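The GC content quoted above is the percentage of G and C bases in the assembled sequence. A minimal sketch of that calculation follows, assuming that ambiguous bases such as N are excluded from the denominator; conventions differ between tools, and the abstract does not say which one was used. The function name and the toy sequences are illustrative.

```python
def gc_content(sequence):
    """Return the GC percentage of a DNA sequence.

    Assumption: only unambiguous bases (A, C, G, T) are counted, so
    N and other IUPAC ambiguity codes do not affect the result.
    """
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc_count = sum(base in "GC" for base in bases)
    return 100.0 * gc_count / len(bases)


if __name__ == "__main__":
    # Toy sequence, not genome data.
    print(f"{gc_content('ATGCGC'):.2f}% GC")
```

For a multi-chromosome genome, the same function can be applied to the concatenated sequence of all replicons to obtain a genome-wide value.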


September 22, 2019

Adaptation of Pseudomonas aeruginosa to phage PaP1 predation via O-antigen polymerase mutation.

Adaptation of bacteria to phage predation poses a major obstacle for phage therapy. Bacteria adopt multiple mechanisms, such as inhibition of phage adsorption and CRISPR/Cas systems, to resist phage infection. Here, a phage-resistant mutant of Pseudomonas aeruginosa strain PA1 under the infection of lytic phage PaP1 was selected for further study. The PaP1-resistant variant, termed PA1RG, showed decreased adsorption to PaP1 and was devoid of long chain O-antigen on its cell envelope. Whole genome sequencing and comparative analysis revealed a single nucleotide mutation in the gene PA1S_08510, which encodes the O-antigen polymerase Wzy that is involved in lipopolysaccharide (LPS) biosynthesis. PA1_Wzy was classified into the O6 serotype based on sequence homology analysis and adopts a transmembrane topology similar to that seen with P. aeruginosa strain PAO1. Complementation of gene wzy in trans enabled the mutant PA1RG to produce the normal LPS pattern with long chain O-antigen and restored the susceptibility of PA1RG to phage PaP1 infection. While wzy mutation did not affect bacterial growth, mutant PA1RG exhibited decreased biofilm production, suggesting a fitness cost of PA1 associated with resistance to phage PaP1 predation. This study uncovered the mechanism responsible for PA1RG resistance to phage PaP1 via wzy mutation and revealed the role of phages in regulating bacterial behavior.

