July 7, 2019

Exploiting next-generation sequencing to solve the haplotyping puzzle in polyploids: a simulation study.

Haplotypes are the units of inheritance in an organism, and many genetic analyses depend on their precise determination. Methods for haplotyping single individuals use the phasing information available in next-generation sequencing reads, by matching overlapping single-nucleotide polymorphisms while penalizing post hoc nucleotide corrections made. Haplotyping diploids is relatively easy, but the complexity of the problem increases drastically for polyploid genomes, which are found in both model organisms and in economically relevant plant and animal species. Although a number of tools are available for haplotyping polyploids, the effects of the genomic makeup and the sequencing strategy followed on the accuracy of these methods have hitherto not been thoroughly evaluated. We developed the simulation pipeline haplosim to evaluate the performance of three haplotype estimation algorithms for polyploids: HapCompass, HapTree and SDhaP, in settings varying in sequencing approach, ploidy levels and genomic diversity, using tetraploid potato as the model. Our results show that sequencing depth is the major determinant of haplotype estimation quality, that 1 kb PacBio circular consensus sequencing reads and Illumina reads with large insert-sizes are competitive and that all methods fail to produce good haplotypes when ploidy levels increase. Comparing the three methods, HapTree produces the most accurate estimates, but also consumes the most resources. There is clearly room for improvement in polyploid haplotyping algorithms.


July 7, 2019

Microbial sequence typing in the genomic era.

Next-generation sequencing (NGS), also known as high-throughput sequencing, is changing the field of microbial genomics research. NGS allows for a more comprehensive analysis of the diversity, structure and composition of microbial genes and genomes compared to the traditional automated Sanger capillary sequencing at a lower cost. NGS strategies have expanded the versatility of standard and widely used typing approaches based on nucleotide variation in several hundred DNA sequences and a few gene fragments (MLST, MLVA, rMLST and cgMLST). NGS can now accommodate variation in thousands or millions of sequences from selected amplicons to full genomes (WGS, NGMLST and HiMLST). To extract signals from high-dimensional NGS data and make valid statistical inferences, novel analytic and statistical techniques are needed. In this review, we describe standard and new approaches for microbial sequence typing at gene and genome levels and guidelines for subsequent analysis, including methods and computational frameworks. We also present several applications of these approaches to some disciplines, namely genotyping, phylogenetics and molecular epidemiology. Copyright © 2017 Elsevier B.V. All rights reserved.


July 7, 2019

Comparative genomic analysis of Lactobacillus plantarum GB-LP4 and identification of evolutionarily divergent genes in high-osmolarity environment.

Lactobacillus plantarum is a widely used probiotic, and a large body of research has examined the effectiveness of this species. However, the differences between previously reported L. plantarum strains and the sources of genomic variation among them have not been clearly specified. To better understand the molecular basis of L. plantarum in Korean traditional fermentation, we isolated L. plantarum GB-LP4 from a Korean fermented vegetable and performed whole-genome assembly. Using a comparative genomics approach, we identified candidate genes that are expected to have undergone evolutionary acceleration. These genes have been reported to be associated with maintaining homeostasis, which is generally known to help overcome instability in the external environment, including low pH or high osmotic pressure. Our results describe the evolutionary relationships among L. plantarum strains and identify candidate genes that play a pivotal role in the evolutionary acceleration of GB-LP4 in a high-osmolarity environment. This study may provide guidance for further studies on L. plantarum.


July 7, 2019

Complete genome sequence of Streptomyces formicae KY5, the formicamycin producer.

Here we report the complete genome of the new species Streptomyces formicae KY5, isolated from Tetraponera fungus-growing ants. S. formicae was sequenced using the PacBio and 454 platforms to generate a single linear chromosome with terminal inverted repeats. Illumina MiSeq sequencing was used to correct base changes resulting from the high error rate associated with PacBio. The genome is 9.6 Mbp, has a GC content of 71.38% and contains 8162 protein coding sequences. Predictive analysis shows this strain encodes at least 45 gene clusters for the biosynthesis of secondary metabolites, including a type 2 polyketide synthase encoding cluster for the antibacterial formicamycins. Streptomyces formicae KY5 is a new, taxonomically distinct Streptomyces species and this complete genome sequence provides an important marker in the genus of Streptomyces. Copyright © 2017 The Author(s). Published by Elsevier B.V. All rights reserved.


July 7, 2019

RepLong: de novo repeat identification using long read sequencing data.

The identification of repetitive elements is important in genome assembly and phylogenetic analyses. The existing de novo repeat identification methods exploiting the use of short reads are impotent in identifying long repeats. Since long reads are more likely to cover repeat regions completely, using long reads is more favorable for recognizing long repeats. In this study, we propose a novel de novo repeat elements identification method namely RepLong based on PacBio long reads. Given that the reads mapped to the repeat regions are highly overlapped with each other, the identification of repeat elements is equivalent to the discovery of consensus overlaps between reads, which can be further cast into a community detection problem in the network of read overlaps. In RepLong, we first construct a network of read overlaps based on pair-wise alignment of the reads, where each vertex indicates a read and an edge indicates a substantial overlap between the corresponding two reads. Secondly, the communities whose intra connectivity is greater than the inter connectivity are extracted based on network modularity optimization. Finally, representative reads in each community are extracted to form the repeat library. Comparison studies on Drosophila melanogaster and human long read sequencing data with genome-based and short-read-based methods demonstrate the efficiency of RepLong in identifying long repeats. RepLong can handle lower coverage data and serve as a complementary solution to the existing methods to promote the repeat identification performance on long-read sequencing data. The software of RepLong is freely available at https://github.com/ruiguo-bio/replong. Contact: ywsun@szu.edu.cn or zhuzx@szu.edu.cn. Supplementary data are available at Bioinformatics online.
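
The three steps described in this abstract (overlap network, modularity-based communities, representative reads) can be illustrated with a toy sketch. This is not RepLong's code: the read names, the edge list, the naive greedy merging strategy and the "highest intra-community degree" rule for picking a representative are all assumptions made for illustration, and a real pipeline would derive the edges from pairwise read alignments.

```python
from itertools import combinations


def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    if m == 0:
        return 0.0
    label = {node: i for i, comm in enumerate(communities) for node in comm}
    intra = [0] * len(communities)         # edges inside each community
    degree_sum = [0] * len(communities)    # summed vertex degrees per community
    for u, v in edges:
        degree_sum[label[u]] += 1
        degree_sum[label[v]] += 1
        if label[u] == label[v]:
            intra[label[u]] += 1
    return sum(intra[i] / m - (degree_sum[i] / (2 * m)) ** 2
               for i in range(len(communities)))


def greedy_communities(nodes, edges):
    """Naive greedy agglomeration: repeatedly apply the merge that raises Q most."""
    communities = [frozenset([n]) for n in nodes]
    best_q = modularity(edges, communities)
    while len(communities) > 1:
        best_merge = None
        for a, b in combinations(range(len(communities)), 2):
            merged = [c for i, c in enumerate(communities) if i not in (a, b)]
            merged.append(communities[a] | communities[b])
            q = modularity(edges, merged)
            if q > best_q + 1e-12:
                best_q, best_merge = q, merged
        if best_merge is None:   # no merge improves Q any further
            break
        communities = best_merge
    return communities, best_q


def representative(community, edges):
    """Hypothetical rule: the read with the most overlaps inside its community."""
    def intra_degree(read):
        return sum(1 for u, v in edges
                   if read in (u, v) and u in community and v in community)
    return max(sorted(community), key=intra_degree)


# Toy overlap network: each vertex is a read, each edge a substantial overlap.
# Two groups of mutually overlapping reads are linked by one spurious overlap.
reads = ["r1", "r2", "r3", "r4", "r5", "r6"]
overlaps = [("r1", "r2"), ("r1", "r3"), ("r2", "r3"),
            ("r4", "r5"), ("r4", "r6"), ("r5", "r6"),
            ("r3", "r4")]

communities, q = greedy_communities(reads, overlaps)
repeat_library = sorted(representative(c, overlaps) for c in communities)
```

On this toy input the two triangles of reads end up in separate communities, and one read per community is placed in `repeat_library`. The pairwise merge search is quadratic per step, so it only suits small examples.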


July 7, 2019

High-quality complete and draft genome sequences for three Escherichia spp. and three Shigella spp. generated with Pacific Biosciences and Illumina sequencing and optical mapping.

Escherichia spp., including E. albertii and E. coli, Shigella dysenteriae, and S. flexneri are causative agents of foodborne disease. We report here reference-level whole-genome sequences of E. albertii (2014C-4356), E. coli (2011C-4315 and 2012C-4431), S. dysenteriae (BU53M1), and S. flexneri (94-3007 and 71-2783). Copyright © 2018 Schroeder et al.


July 7, 2019

Draft genome sequence of Cyanobacterium sp. strain HL-69, isolated from a benthic microbial mat from a magnesium sulfate-dominated hypersaline lake.

The complete genome sequence of Cyanobacterium sp. strain HL-69 consists of 3,155,247 bp and contains 2,897 predicted genes comprising a chromosome and two plasmids. The genome is consistent with a halophilic nondiazotrophic phototrophic lifestyle, and this organism is able to synthesize most B vitamins and produces several secondary metabolites. Copyright © 2018 Mobberley et al.


July 7, 2019

Natural rubber and the Russian dandelion genome

The world needs rubber. Rubber is crucial for the tires on the cars, trucks and airplanes that propel modern transportation. It is equally important for daily tasks: latex gloves in the lab, balloons in angioplasty and wetsuits that warm a cold dip in the ocean. Rubber can be made synthetically from petroleum derivatives, but synthetic rubber is not as strong as rubber isolated from plants. The principal plant source for natural rubber (NR) is the sap of the Pará tree (Hevea brasiliensis), which is grown throughout Southeast Asia. Unfortunately, the production capacity of the Pará tree is limited by the availability of suitable land and by labor-intensive harvesting methods. The sustainability of the Pará crop is also constrained by its narrow genetic base, which may make the crop susceptible to disease.


July 7, 2019

Complete genome sequence of the marine Rhodococcus sp. H-CA8f isolated from Comau fjord in Northern Patagonia, Chile

Rhodococcus sp. H-CA8f was isolated from marine sediments obtained from the Comau fjord, located in Northern Chilean Patagonia. Whole-genome sequencing was achieved using the PacBio RS II platform, comprising one closed, complete chromosome of 6.19 Mbp with a 62.45% G + C content. The chromosome harbours several metabolic pathways providing a wide catabolic potential, where the upper biphenyl route is described. Also, Rhodococcus sp. H-CA8f bears one linear mega-plasmid of 301 Kbp and 62.34% of G + C content, where genomic analyses demonstrated that it is constituted mostly by putative ORFs with unknown functions, representing a novel genetic feature. These genetic characteristics provide relevant insights regarding Chilean marine actinobacterial strains.


July 7, 2019

Host genetic variation strongly influences the microbiome structure and function in fungal fruiting-bodies.

Despite increasing knowledge on host-associated microbiomes, little is known about mechanisms underlying fungus-microbiome interactions. This study aimed to examine the relative importance of host genetic, geographic and environmental variations in structuring fungus-associated microbiomes. We analyzed the taxonomic composition and function of microbiomes inhabiting fungal fruiting-bodies in relation to host genetic variation, soil pH and geographic distance between samples. For this, we sequenced the metagenomes of 40 fruiting-bodies collected from six fairy rings (i.e., genets) of a saprotrophic fungus Marasmius oreades. Our analyses revealed that fine genetic variations between host fungi could strongly affect their associated microbiome, explaining, respectively, 25% and 37% of the variation in microbiome structure and function, whereas geographic distance and soil pH remained of secondary importance. These results, together with the smaller genome size of fungi compared to other eukaryotes, suggest that fruiting-bodies are suitable for further genome-centric studies on host-microbiome interactions. © 2018 Society for Applied Microbiology and John Wiley & Sons Ltd.


July 7, 2019

Complete genome sequence of multiple-antibiotic-resistant Streptococcus parauberis strain SPOF3K, isolated from diseased olive flounder (Paralichthys olivaceus).

Here, we report the complete genome sequence of multiple-antibiotic-resistant Streptococcus parauberis strain SPOF3K, isolated from the kidney of a diseased olive flounder in South Korea in 2013. Sequencing using a PacBio platform yielded a circular chromosome of 2,128,740 bp and a plasmid of 23,538 bp, harboring 2,123 and 24 protein-coding genes, respectively. Copyright © 2018 Lee et al.


July 7, 2019

Strategies for high-altitude adaptation revealed from high-quality draft genome of non-violacein producing Janthinobacterium lividum ERGS5:01.

A light pink coloured bacterial strain, ERGS5:01, isolated from glacial stream water of Sikkim Himalaya was affiliated to Janthinobacterium lividum based on 16S rRNA gene sequence identity and phylogenetic clustering. Whole genome sequencing was performed for the strain to confirm its taxonomy, as it lacked the typical violet pigmentation of the genus, and also to decipher its survival strategy in the aquatic ecosystem of high elevation. PacBio RSII sequencing generated a genome of 5,168,928 bp with 4575 protein-coding genes and 118 RNA genes. Whole genome-based multilocus sequence analysis clustering, an in silico DDH similarity value of 95.1% and an ANI value of 99.25% established the identity of the strain ERGS5:01 (MCC 2953) as a non-violacein producing J. lividum. The genome comparisons across the genus Janthinobacterium revealed an open pan-genome with scope for the addition of new orthologous clusters to complete the genomic inventory. The genomic insight provided the genetic basis of tolerance to freezing and frequent freeze-thaw cycles, and of industrially important enzymes. Extended insight into the genome provided clues to crucial genes associated with adaptation to the harsh aquatic ecosystem of high altitude.

