July 7, 2019

NanoPack: visualizing and processing long-read sequencing data.

Here we describe NanoPack, a set of tools developed for visualization and processing of long-read sequencing data from Oxford Nanopore Technologies and Pacific Biosciences. The NanoPack tools are written in Python3 and released under the GNU GPL3.0 License. The source code can be found at https://github.com/wdecoster/nanopack, together with links to separate scripts and their documentation. The scripts are compatible with Linux, Mac OS and the MS Windows 10 subsystem for Linux and are available as a graphical user interface, a web service at http://nanoplot.bioinf.be and command line tools. Supplementary data are available at Bioinformatics online.


July 7, 2019

Assembly, annotation, and comparative genomics in PATRIC, the All Bacterial Bioinformatics Resource Center.

In the “big data” era, research biologists are faced with analyzing new data types that usually require some level of computational expertise. A number of programs and pipelines exist, but acquiring the expertise to run them and then understanding the output can be a challenge. The Pathosystems Resource Integration Center (PATRIC, www.patricbrc.org) has created an end-to-end analysis platform that allows researchers to take their raw reads, assemble a genome, annotate it, and then use a suite of user-friendly tools to compare it to any public data that is available in the repository. With close to 113,000 bacterial and more than 1000 archaeal genomes, PATRIC creates a unique research experience with “virtual integration” of private and public data. PATRIC contains many diverse tools and functionalities to explore both genome-scale and gene expression data, but the main focus of this chapter is on assembly, annotation, and the downstream comparative analysis functionality that is freely available in the resource.


July 7, 2019

Complete genome sequence of Streptomyces formicae KY5, the formicamycin producer.

Here we report the complete genome of the new species Streptomyces formicae KY5 isolated from Tetraponera fungus-growing ants. S. formicae was sequenced using the PacBio and 454 platforms to generate a single linear chromosome with terminal inverted repeats. Illumina MiSeq sequencing was used to correct base changes resulting from the high error rate associated with PacBio. The genome is 9.6 Mbp, has a GC content of 71.38% and contains 8162 protein coding sequences. Predictive analysis shows this strain encodes at least 45 gene clusters for the biosynthesis of secondary metabolites, including a type 2 polyketide synthase encoding cluster for the antibacterial formicamycins. Streptomyces formicae KY5 is a new, taxonomically distinct Streptomyces species and this complete genome sequence provides an important marker in the genus of Streptomyces. Copyright © 2017 The Author(s). Published by Elsevier B.V. All rights reserved.


July 7, 2019

Complete genome sequence of Planococcus faecalis AJ003T, the type species of the genus Planococcus and a microbial C30 carotenoid producer.

A novel type strain, Planococcus faecalis AJ003T, isolated from the feces of Antarctic penguins, synthesizes a rare C30 carotenoid, glycosyl-4,4′-diaponeurosporen-4′-ol-4-oic acid. The complete genome of P. faecalis AJ003T comprises a single circular chromosome (3,495,892 bp; 40.9% G + C content). Annotation analysis has revealed 3511 coding DNA sequences and 99 RNAs; seven genes associated with the MEP pathway and five genes involved in the carotenoid pathway have been identified. The functionality and complementation of 4,4′-diapophytoene synthase (CrtM) and two copies of heterologous 4,4′-diapophytoene desaturase (CrtN) involved in carotenoid biosynthesis were analyzed in Escherichia coli. Copyright © 2017 Elsevier B.V. All rights reserved.


July 7, 2019

Complete genome sequence of Flavobacterium kingsejongi WV39, a type species of the genus Flavobacterium and a microbial C40 carotenoid zeaxanthin producer.

A novel species, Flavobacterium kingsejongi WV39, isolated from feces of Antarctic penguins and a type species of the genus Flavobacterium, is yellow because it synthesizes a C40 carotenoid zeaxanthin. The complete genome of F. kingsejongi WV39 is made up of a single circular chromosome (4,224,053 bp, 39.8% G + C content). Annotation analysis revealed 3,955 coding sequences, 72 RNAs (18 rRNA + 54 tRNA), and five genes involved in zeaxanthin biosynthesis. The key gene encoding β-carotenoid hydroxylase (CrtZ), which is the last enzyme in the zeaxanthin biosynthetic pathway, was cloned and subjected to complementary analysis in a heterologous E. coli strain. The CrtZ of F. kingsejongi WV39 showed a higher activity than other reported CrtZs. Copyright © 2017 Elsevier B.V. All rights reserved.


July 7, 2019

RepLong: de novo repeat identification using long read sequencing data.

The identification of repetitive elements is important in genome assembly and phylogenetic analyses. The existing de novo repeat identification methods that rely on short reads are ineffective at identifying long repeats. Since long reads are more likely to cover repeat regions completely, using long reads is more favorable for recognizing long repeats. In this study, we propose a novel de novo repeat element identification method, RepLong, based on PacBio long reads. Given that the reads mapped to the repeat regions overlap heavily with each other, the identification of repeat elements is equivalent to the discovery of consensus overlaps between reads, which can be further cast into a community detection problem in the network of read overlaps. In RepLong, we first construct a network of read overlaps based on pair-wise alignment of the reads, where each vertex indicates a read and an edge indicates a substantial overlap between the corresponding two reads. Secondly, the communities whose intra-connectivity is greater than the inter-connectivity are extracted based on network modularity optimization. Finally, representative reads in each community are extracted to form the repeat library. Comparison studies on Drosophila melanogaster and human long read sequencing data with genome-based and short-read-based methods demonstrate the efficiency of RepLong in identifying long repeats. RepLong can handle lower coverage data and serve as a complementary solution to the existing methods to promote the repeat identification performance on long-read sequencing data. The software of RepLong is freely available at https://github.com/ruiguo-bio/replong. Contact: ywsun@szu.edu.cn or zhuzx@szu.edu.cn. Supplementary data are available at Bioinformatics online.
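The three steps described in this abstract (overlap network, modularity-based communities, representative reads) can be illustrated with a small sketch. This is not RepLong's code: the read IDs are hypothetical, the overlaps are assumed to have been computed already by some pairwise aligner, the community step is a naive greedy merge that only suits tiny graphs, and picking the read with the most within-community overlaps as the representative is an assumption made here for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_overlap_graph(overlaps):
    """Vertices are read IDs; an edge joins two reads with a substantial overlap."""
    graph = defaultdict(set)
    for a, b in overlaps:
        if a != b:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def greedy_modularity_communities(graph):
    """Naive greedy merge: join the pair of communities with the largest
    positive modularity gain until no merge improves modularity."""
    m = sum(len(nbrs) for nbrs in graph.values()) / 2  # number of edges
    if m == 0:
        return [{v} for v in graph]
    communities = {i: {v} for i, v in enumerate(sorted(graph))}
    while True:
        best_gain, best_pair = 0.0, None
        for i, j in combinations(sorted(communities), 2):
            ci, cj = communities[i], communities[j]
            # edges running between the two communities
            e_ij = sum(1 for u in ci for v in graph[u] if v in cj)
            if e_ij == 0:
                continue
            d_i = sum(len(graph[u]) for u in ci)  # total degree of community i
            d_j = sum(len(graph[u]) for u in cj)
            # modularity change if i and j were merged
            gain = e_ij / m - (d_i * d_j) / (2 * m * m)
            if gain > best_gain:
                best_gain, best_pair = gain, (i, j)
        if best_pair is None:
            break
        i, j = best_pair
        communities[i] |= communities.pop(j)
    return list(communities.values())


def representative_reads(graph, communities, min_size=2):
    """Assumed rule: per community, keep the read with most internal overlaps."""
    reps = []
    for community in communities:
        if len(community) >= min_size:
            reps.append(max(
                sorted(community),
                key=lambda r: sum(1 for v in graph[r] if v in community),
            ))
    return reps


# Hypothetical overlaps: two dense groups of reads joined by a single overlap.
overlaps = [("r1", "r2"), ("r1", "r3"), ("r2", "r3"),
            ("r4", "r5"), ("r4", "r6"), ("r5", "r6"),
            ("r3", "r4")]
graph = build_overlap_graph(overlaps)
communities = greedy_modularity_communities(graph)
repeat_library = representative_reads(graph, communities)
```

On this toy input the two densely connected groups end up in separate communities, because merging them across the single bridging overlap would lower modularity. A real data set would need a scalable modularity optimizer rather than this quadratic pairwise search.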


July 7, 2019

Enhancing the accuracy of next-generation sequencing for detecting rare and subclonal mutations.

Mutations, the fuel of evolution, are first manifested as rare DNA changes within a population of cells. Although next-generation sequencing (NGS) technologies have revolutionized the study of genomic variation between species and individual organisms, most have limited ability to accurately detect and quantify rare variants among the different genome copies in heterogeneous mixtures of cells or molecules. We describe the technical challenges in characterizing subclonal variants using conventional NGS protocols and the recent development of error correction strategies, both computational and experimental, including consensus sequencing of single DNA molecules. We also highlight major applications for low-frequency mutation detection in science and medicine, describe emerging methodologies and provide our vision for the future of DNA sequencing.


July 7, 2019

Lepidoptera genomes: current knowledge, gaps and future directions.

Butterflies and moths (Lepidoptera) are one of the most ecologically diverse and speciose insect orders. With recent advances in genomics, new Lepidoptera genomes are regularly being sequenced, and many of them are playing principal roles in genomics studies, particularly in the fields of phylogenomics and functional genomics. Thus far, assembled genomes are only available for <10 of the 43 Lepidoptera superfamilies. Nearly all are model species, found in the speciose clade Ditrysia. Community support for Lepidoptera genomics is growing with successful management and dissemination of data and analytical tools in centralized databases. With genomic studies quickly becoming integrated with ecological and evolutionary research, the Lepidoptera community will unquestionably benefit from new high-quality reference genomes that are more evenly distributed throughout the order. Copyright © 2018 Elsevier Inc. All rights reserved.


July 7, 2019

Construction of two whole genome radiation hybrid panels for dromedary (Camelus dromedarius): 5000RAD and 15000RAD.

The availability of genomic resources including linkage information for camelids has been very limited. Here, we describe the construction of a set of two radiation hybrid (RH) panels (5000RAD and 15000RAD) for the dromedary (Camelus dromedarius) as a permanent genetic resource for camel genome researchers worldwide. For the 5000RAD panel, a total of 245 female camel-hamster radiation hybrid clones were collected, of which 186 were screened with 44 custom designed marker loci distributed throughout the camel genome. The overall mean retention frequency (RF) of the final set of 93 hybrids was 47.7%. For the 15000RAD panel, 238 male dromedary-hamster radiation hybrid clones were collected, of which 93 were tested using 44 PCR markers. The final set of 90 clones had a mean RF of 39.9%. This 15000RAD panel is an important high-resolution complement to the main 5000RAD panel and an indispensable tool for resolving complex genomic regions. This valuable genetic resource of dromedary RH panels is expected to be instrumental for constructing a high resolution camel genome map. Construction of the set of RH panels is an essential step toward a chromosome-level, reference-quality genome assembly that is critical for advancing camelid genomics and the development of custom genomic tools.


July 7, 2019

Current advances in genome sequencing of common wheat and its ancestral species

Common wheat is an important and widely cultivated food crop throughout the world. Much progress has been made in wheat genome sequencing in the last decade. Starting from the sequencing of single chromosomes and chromosome arms, whole genome sequences of common wheat and its diploid and tetraploid ancestors have been decoded along with the development of sequencing and assembly technologies. In this review, we give a brief summary of international progress in wheat genome sequencing, and mainly focus on reviewing the effort and contributions made by Chinese scientists.


July 7, 2019

Inferring synteny between genome assemblies: a systematic evaluation.

Genome assemblies across all domains of life are being produced routinely. Initial analysis of a new genome usually includes annotation and comparative genomics. Synteny provides a framework in which conservation of homologous genes and gene order is identified between genomes of different species. The availability of human and mouse genomes paved the way for algorithm development in large-scale synteny mapping, which eventually became an integral part of comparative genomics. Synteny analysis is regularly performed on assembled sequences that are fragmented, neglecting the fact that most methods were developed using complete genomes. It is unknown to what extent draft assemblies lead to errors in such analysis. We fragmented genome assemblies of model nematodes to various extents and conducted synteny identification and downstream analysis. We first show that synteny between species can be underestimated up to 40% and find disagreements between popular tools that infer synteny blocks. This inconsistency and further demonstration of erroneous gene ontology enrichment tests raise questions about the robustness of previous synteny analysis when gold standard genome sequences remain limited. In addition, assembly scaffolding using a reference guided approach with a closely related species may result in chimeric scaffolds with inflated assembly metrics if a true evolutionary relationship was overlooked. Annotation quality, however, has minimal effect on synteny if the assembled genome is highly contiguous. Our results show that a minimum N50 of 1 Mb is required for robust downstream synteny analysis, which emphasizes the importance of gold standard genomes to the science community, and should be achieved given the current progress in sequencing technology.
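For readers unfamiliar with the N50 metric used in this abstract, a minimal sketch of the usual definition follows: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The function name and the example scaffold lengths are hypothetical and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths for a small draft assembly.
scaffolds = [1_200_000, 1_000_000, 800_000, 500_000, 500_000]
meets_threshold = n50(scaffolds) >= 1_000_000  # the 1 Mb minimum cited above
```

Comparing the result against 1,000,000 bp gives a quick check of whether an assembly reaches the contiguity the authors recommend for synteny analysis.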


July 7, 2019

Complete genome sequence of the halophilic methylotrophic methanogen archaeon Methanohalophilus portucalensis strain FDF-1T.

We report here the complete genome sequence (2.08 Mb) of Methanohalophilus portucalensis strain FDF-1T, a halophilic methylotrophic methanogen isolated from the sediment of a saltern in Figueira da Foz, Portugal. The average nucleotide identity and DNA-DNA hybridization analyses show that Methanohalophilus mahii, M. halophilus, and M. portucalensis are three different species within the Methanosarcinaceae family.


July 7, 2019

Complete genome sequence of Chryseobacterium camelliae Dolsongi-HT1, a green tea isolate with keratinolytic activity.

The complete genome sequence of Chryseobacterium camelliae Dolsongi-HT1 is reported here. C. camelliae Dolsongi-HT1, having keratinolytic activity, was isolated from green tea leaves in the Dolsongi tea garden in Jeju, South Korea. The strain Dolsongi-HT1 has 28 candidate protease genes, which may be utilized in further studies and industrial applications of keratinase. Copyright © 2018 Kim et al.


July 7, 2019

Complete genome sequencing of Acinetobacter sp. strain LoGeW2-3, isolated from the pellet of a white stork, reveals a novel class D beta-lactamase gene.

Whole-genome sequencing of Acinetobacter sp. strain LoGeW2-3, isolated from the pellet of a white stork (Ciconia ciconia), reveals the presence of a plasmid of 179,399 bp encoding a CRISPR-Cas (clustered regularly interspaced short palindromic repeats and associated genes) system of the I-F type, and the chromosomally encoded novel class D beta-lactamase OXA-568. Copyright © 2018 Blaschke et al.

