July 7, 2019

Susan Celniker: Foundational resources to study a dynamic genome.

The Genetics Society of America’s George W. Beadle Award honors individuals who have made outstanding contributions to the community of genetics researchers and who exemplify the qualities of its namesake. The 2016 recipient, Susan E. Celniker, played a key role in the sequencing, annotation, and characterization of the Drosophila genome. She participated in early sequencing efforts at the Lawrence Berkeley National Laboratory and led the modENCODE Fly Transcriptome Consortium. Her efforts were critical to ensuring that the Drosophila genome was well-annotated, making it one of the best curated animal genomes available. As the Principal Investigator for the BDGP, Celniker has enabled the study of proteomes by creating a collection of over 13,000 clones that match annotated genes for protein expression in cells or transgenic flies, and she has established the most comprehensive spatial gene expression atlas in any organism, with in situ imaging of more than 80% of the Drosophila protein-coding transcriptome through embryogenesis. In addition to providing the research community with these invaluable resources and reagents, she continues to develop new tools and datasets for genetics researchers to explore the spatial and temporal control of gene expression.


July 7, 2019

Complete genome sequence of a psychotrophic Pseudarthrobacter sulfonivorans strain Ar51 (CGMCC 4.7316), a novel crude oil and multi benzene compounds degradation strain.

Pseudarthrobacter sulfonivorans strain Ar51, a psychotrophic bacterium isolated from the Tibet permafrost of China, can degrade crude oil and multi benzene compounds efficiently at low temperature. Here we report the complete genome sequence of this bacterium, which consists of a circular chromosome of 5.04 Mbp and a circular plasmid of 12.39 kbp. The availability of this genome sequence allows us to investigate the genetic basis of crude oil degradation and adaptation to growth in a nutrient-poor permafrost environment. Copyright © 2016 Elsevier B.V. All rights reserved.


July 7, 2019

Exploiting next-generation sequencing to solve the haplotyping puzzle in polyploids: a simulation study.

Haplotypes are the units of inheritance in an organism, and many genetic analyses depend on their precise determination. Methods for haplotyping single individuals use the phasing information available in next-generation sequencing reads, by matching overlapping single-nucleotide polymorphisms while penalizing post hoc nucleotide corrections made. Haplotyping diploids is relatively easy, but the complexity of the problem increases drastically for polyploid genomes, which are found in both model organisms and in economically relevant plant and animal species. Although a number of tools are available for haplotyping polyploids, the effects of the genomic makeup and the sequencing strategy followed on the accuracy of these methods have hitherto not been thoroughly evaluated. We developed the simulation pipeline haplosim to evaluate the performance of three haplotype estimation algorithms for polyploids: HapCompass, HapTree and SDhaP, in settings varying in sequencing approach, ploidy levels and genomic diversity, using tetraploid potato as the model. Our results show that sequencing depth is the major determinant of haplotype estimation quality, that 1 kb PacBio circular consensus sequencing reads and Illumina reads with large insert-sizes are competitive and that all methods fail to produce good haplotypes when ploidy levels increase. Comparing the three methods, HapTree produces the most accurate estimates, but also consumes the most resources. There is clearly room for improvement in polyploid haplotyping algorithms.


July 7, 2019

Complete genome sequence of Marivivens sp. JLT3646, a potential aromatic compound degrader

Marivivens sp. JLT3646 (CGMCC 1.15778), belonging to the phylum Alphaproteobacteria, was isolated from seawater, Kueishan Islet, offshore northeast of Taiwan. Here, we present the complete genome sequence of Marivivens sp. JLT3646, which contains a circular 2,978,145 bp chromosome with 56.2% G + C content, and one circular plasmid which is 169,066 bp in length. The genome data suggested that Marivivens sp. JLT3646 has the potential to degrade aromatic monomers, which might provide insight into biotechnological applications and facilitate the investigation of environmental bioremediation.
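The G + C content quoted above is the percentage of bases in the sequence that are G or C. A minimal sketch of that calculation follows; the function name `gc_content` and the choice to ignore ambiguous symbols such as N are illustrative assumptions, not the procedure used by the authors.

```python
def gc_content(sequence: str) -> float:
    """Return the G + C percentage of a DNA sequence.

    Assumption: only unambiguous bases (A, C, G, T) are counted, so
    symbols such as N do not affect the result.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / (gc + at)


if __name__ == "__main__":
    # Toy sequence for illustration only, not taken from the genome.
    print(gc_content("ATGCGCGTTA"))
```

For a multi-replicon genome, the same function could be applied to each replicon separately or to the concatenated sequence, depending on which figure is being reported.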


July 7, 2019

Complete genome sequence of human pathogen Kosakonia cowanii type strain 888-76T.

Kosakonia cowanii type strain 888-76T is a human pathogen which was originally isolated from blood as NIH group 42. In this study, we report the complete genome sequence of K. cowanii 888-76T. Strain 888-76T has 1 chromosome and 2 plasmids, with a total genome size of 4,857,567 bp and a G+C content of 56.15%. This genome sequence will not only help us to understand the virulence features of K. cowanii 888-76T but also provide useful information for studying the evolution of the genus Kosakonia. Copyright © 2017 Sociedade Brasileira de Microbiologia. Published by Elsevier Editora Ltda. All rights reserved.


July 7, 2019

Microbial metagenomics mock scenario-based sample simulation (M3S3).

Shotgun sequencing is increasingly applied in clinical microbiology for unbiased culture-independent diagnosis. While software solutions for metagenomics proliferate, integration of metagenomics in clinical care requires method standardisation and validation. Virtual metagenomics samples could underpin validation by substituting real samples and thus we sought to develop a novel solution for simulation of metagenomics samples based on user-defined clinical scenarios. We designed the Microbial Metagenomics Mock Scenario-based Sample Simulation (M3S3) workflow, which allows users to generate virtual samples from raw reads or assemblies. The M3S3 output is a mock sample in FASTQ or FASTA format. M3S3 was tested by generating virtual samples for ten challenging infectious disease scenarios, involving a background matrix ‘spiked’ in silico with pathogens including mixtures. Replicate samples (seven per scenario) were used to represent different compositional ratios. Virtual samples were analysed using Taxonomer and Kraken db. The ten challenge scenarios were successfully applied, generating 80 samples. For all tested scenarios, the virtual samples showed sequence compositions as predicted from the user input. Spiked pathogen sequences were identified with the majority of the replicates and most exhibited acceptable abundance (deviation between expected and observed abundance of spiked pathogens), with slight differences observed between software tools. Despite demonstrated proof-of-concept, integration of clinical metagenomics in routine microbiology remains a substantial challenge. M3S3 is capable of producing virtual samples on-demand, simulating a spectrum of clinical diagnostic scenarios of varying complexity. The M3S3 tool can therefore support the development and validation of standardised metagenomics applications. Copyright © 2017. Published by Elsevier Ltd.
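The idea of spiking a background matrix in silico at a chosen compositional ratio can be sketched in a few lines. This is not the M3S3 implementation: the function `make_mock_sample`, its parameters, and the sampling-with-replacement strategy are all hypothetical choices made for illustration.

```python
import random


def make_mock_sample(background, pathogen, pathogen_fraction, total_reads, seed=0):
    """Build a virtual sample by mixing pathogen reads into background reads.

    Hypothetical sketch: reads are drawn with replacement from each pool so
    that `pathogen_fraction` of the `total_reads` come from the pathogen.
    """
    if not 0.0 <= pathogen_fraction <= 1.0:
        raise ValueError("pathogen_fraction must be between 0 and 1")
    n_pathogen = round(total_reads * pathogen_fraction)
    n_background = total_reads - n_pathogen
    if (n_pathogen and not pathogen) or (n_background and not background):
        raise ValueError("cannot sample from an empty read pool")

    rng = random.Random(seed)  # fixed seed keeps replicates reproducible
    sample = rng.choices(pathogen, k=n_pathogen) if n_pathogen else []
    if n_background:
        sample += rng.choices(background, k=n_background)
    rng.shuffle(sample)  # avoid grouping reads by origin in the output
    return sample


if __name__ == "__main__":
    # Toy read identifiers standing in for FASTQ records.
    background_reads = [f"host_{i}" for i in range(50)]
    pathogen_reads = [f"pathogen_{i}" for i in range(5)]
    mock = make_mock_sample(background_reads, pathogen_reads, 0.1, 100)
    print(sum(r.startswith("pathogen") for r in mock), "pathogen reads of", len(mock))
```

Varying `pathogen_fraction` across calls would give a series of replicates with different compositional ratios, which is the kind of input a classifier can then be scored against.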


July 7, 2019

Collection and storage of HLA NGS genotyping data for the 17th International HLA and Immunogenetics Workshop.

For over 50 years, the International HLA and Immunogenetics Workshops (IHIW) have advanced the fields of histocompatibility and immunogenetics (H&I) via community sharing of technology, experience and reagents, and the establishment of ongoing collaborative projects. Held in the fall of 2017, the 17th IHIW focused on the application of next generation sequencing (NGS) technologies for clinical and research goals in the H&I fields. NGS technologies have the potential to allow dramatic insights and advances in these fields, but the scope and sheer quantity of data associated with NGS raise challenges for their analysis, collection, exchange and storage. The 17th IHIW adopted a centralized approach to these issues, and we developed the tools, services and systems to create an effective system for capturing and managing these NGS data. We worked with NGS platform and software developers to define a set of distinct but equivalent NGS typing reports that record NGS data in a uniform fashion. The 17th IHIW database applied our standards, tools and services to collect, validate and store those structured, multi-platform data in an automated fashion. We have created community resources to enable exploration of the vast store of curated sequence and allele-name data in the IPD-IMGT/HLA Database, with the goal of creating a long-term community resource that integrates these curated data with new NGS sequence and polymorphism data, for advanced analyses and applications. Copyright © 2017 American Society for Histocompatibility and Immunogenetics. Published by Elsevier Inc. All rights reserved.


July 7, 2019

ReMILO: reference assisted misassembly detection algorithm using short and long reads.

Contigs assembled from second generation sequencing short reads may contain misassemblies, which complicate downstream analysis or even lead to incorrect results. Fortunately, as more and more sequenced species become available, it becomes possible to use the reference genome of a closely related species to detect misassemblies. In addition, long reads from third generation sequencing technology are increasingly widely used, and can also help detect misassemblies. Here, we introduce ReMILO, a reference assisted misassembly detection algorithm that uses both short reads and PacBio SMRT long reads. ReMILO aligns the initial short reads to both the contigs and reference genome, and then constructs a novel data structure called red-black multipositional de Bruijn graph to detect misassemblies. In addition, ReMILO aligns the contigs to long reads and finds their differences from the long reads to detect more misassemblies. In our performance test on short read assemblies of human chromosome 14 data, ReMILO can detect 41.8-77.9% extensive misassemblies and 33.6-54.5% local misassemblies. On hybrid short and long read assemblies of S. pastorianus data, ReMILO can also detect 60.6-70.9% extensive misassemblies and 28.6-54.0% local misassemblies. The ReMILO software can be downloaded for free under Artistic License 2.0 from this site: https://github.com/songc001/remilo. Contact: baoe@bjtu.edu.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com


July 7, 2019

Microbial sequence typing in the genomic era.

Next-generation sequencing (NGS), also known as high-throughput sequencing, is changing the field of microbial genomics research. NGS allows for a more comprehensive analysis of the diversity, structure and composition of microbial genes and genomes compared to the traditional automated Sanger capillary sequencing at a lower cost. NGS strategies have expanded the versatility of standard and widely used typing approaches based on nucleotide variation in several hundred DNA sequences and a few gene fragments (MLST, MLVA, rMLST and cgMLST). NGS can now accommodate variation in thousands or millions of sequences from selected amplicons to full genomes (WGS, NGMLST and HiMLST). To extract signals from high-dimensional NGS data and make valid statistical inferences, novel analytic and statistical techniques are needed. In this review, we describe standard and new approaches for microbial sequence typing at gene and genome levels and guidelines for subsequent analysis, including methods and computational frameworks. We also present several applications of these approaches to some disciplines, namely genotyping, phylogenetics and molecular epidemiology. Copyright © 2017 Elsevier B.V. All rights reserved.


July 7, 2019

Genomic insights into Photobacterium damselae subsp. damselae strain KC-Na-1, isolated from the finless porpoise (Neophocaena asiaeorientalis)

Photobacterium damselae subsp. damselae (PDD) is a marine bacterium that can infect a variety of marine animals and humans. Although this bacterium has been isolated from several stranded dolphins and whales, its pathogenic role in cetaceans is still unclear. In this study, we report the complete genome of PDD strain KC-Na-1 isolated from a finless porpoise (Neophocaena asiaeorientalis) rescued from the South Sea (Republic of Korea). The sequenced genome comprised two chromosomes and four plasmids. Among the recently identified major virulence factors in PDD, only phospholipase (plpV) was found in strain KC-Na-1. Interestingly, two genes homologous to Vibrio thermostable direct hemolysin (tdh) and its transcriptional regulator toxR, which are known virulence factors associated with Vibrio parahaemolyticus, were encoded on the plasmid pPDD-Na-1-3. Based on these results, strain KC-Na-1 may have potential pathogenicity in humans and other marine animals and also could act as a potential virulent strain. To the best of our knowledge, this is the first report of the complete genome sequence of P. damselae.


July 7, 2019

A high throughput screen for active human transposable elements.

Transposable elements (TEs) are mobile genetic sequences that randomly propagate within their host’s genome. This mobility has the potential to affect gene transcription and cause disease. However, TEs are technically challenging to identify, which complicates efforts to assess the impact of TE insertions on disease. Here we present a targeted sequencing protocol and computational pipeline to identify polymorphic and novel TE insertions using next-generation sequencing: TE-NGS. The method simultaneously targets the three subfamilies that are responsible for the majority of recent TE activity (L1HS, AluYa5/8, and AluYb8/9) thereby obviating the need for multiple experiments and reducing the amount of input material required. Here we describe the laboratory protocol and detection algorithm, and a benchmark experiment for the reference genome NA12878. We demonstrate a substantial enrichment for on-target fragments, and high sensitivity and precision to both reference and NA12878-specific insertions. We report 17 previously unreported loci for this individual which are supported by orthogonal long-read evidence, and we identify 1470 polymorphic and novel TEs in 12 additional samples that were previously undocumented in databases of insertion polymorphisms. We anticipate that future applications of TE-NGS alongside exome sequencing of patients with sporadic disease will reduce the number of unresolved cases, and improve estimates of the contribution of TEs to human genetic disease.


July 7, 2019

New high copy tandem repeat in the content of the chicken W chromosome.

The content of repetitive DNA in avian genomes is considerably lower than in other investigated vertebrates. The first descriptions of tandem repeats were based on the results of routine biochemical and molecular biological experiments. Both satellite DNA and interspersed repetitive elements were annotated using a library-based approach and de novo repeat identification in the assembled genome. The development of deep-sequencing methods provides high-quality datasets without preassembly, allowing one to annotate repetitive elements from the unassembled part of genomes. In this work, we search the chicken assembly and annotate high copy number tandem repeats from unassembled short raw reads. The tandem repeat (GGAAA)n has been identified and found to be the second most abundant in the chicken genome, after the telomeric repeat (TTAGGG)n. Furthermore, the (GGAAA)n repeat forms expanded arrays on both arms of the chicken W chromosome. Our results highlight the complexity of repetitive sequences and update data on the organization of the W sex chromosome in chicken.
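Searching unassembled reads for a tandem repeat such as (GGAAA)n can be illustrated with a simple pattern scan. The sketch below is an assumption-laden toy, not the authors' pipeline: the helper names, the regular-expression approach, and the `min_copies` threshold are all hypothetical.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_tandem_run(read: str, unit: str) -> int:
    """Largest number of consecutive copies of `unit` found on either strand."""
    best = 0
    read = read.upper()
    for motif in (unit.upper(), reverse_complement(unit.upper())):
        # Non-capturing group repeated one or more times matches a whole array.
        for match in re.finditer(f"(?:{re.escape(motif)})+", read):
            best = max(best, len(match.group()) // len(motif))
    return best


def reads_with_repeat(reads, unit: str, min_copies: int = 3) -> int:
    """Count reads carrying at least `min_copies` tandem copies of `unit`."""
    return sum(1 for read in reads if longest_tandem_run(read, unit) >= min_copies)


if __name__ == "__main__":
    # Toy reads for illustration only.
    toy_reads = ["ACGGAAAGGAAAGGAAATT", "TTTCCTTTCC", "ACGTACGT"]
    print(reads_with_repeat(toy_reads, "GGAAA", min_copies=2))
```

Dividing such a count by the total number of reads would give a rough abundance estimate that can be compared between repeat units, for example against (TTAGGG)n.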


July 7, 2019

Cupriavidus malaysiensis sp. nov., a novel poly(3-hydroxybutyrate-co-4-hydroxybutyrate) accumulating bacterium isolated from the Malaysian environment.

Bacterial classification on the basis of a polyphasic approach was conducted on three poly(3-hydroxybutyrate-co-4-hydroxybutyrate) [P(3HB-co-4HB)] accumulating bacterial strains that were isolated from samples collected from Malaysian environments: Kulim Lake, Sg. Pinang river and Sg. Manik paddy field. The Gram-negative, rod-shaped, motile, non-sporulating and non-fermenting bacteria were shown to belong to the genus Cupriavidus of the Betaproteobacteria on the basis of their 16S rRNA gene sequence analyses. The sequence similarity value with their near phylogenetic neighbour, Cupriavidus pauculus LMG3413T, was 98.5%. However, the DNA-DNA hybridization values (8-58%) and ribotyping analysis both enabled these strains to be differentiated from related Cupriavidus species with validly published names. The RiboPrint patterns of the three strains also revealed that the strains were genetically related even though they displayed a clonal diversity. The major cellular fatty acids detected in these strains included C15:0 ISO 2OH/C16:1 ω7c, hexadecanoic (16:0) and cis-11-octadecenoic (C18:1 ω7c). Their G+C contents ranged from 68.0 to 68.6 mol%, and their major isoprenoid quinone was Ubiquinone Q-8. Of these three strains, only strain USMAHM13 (= DSM 25816 = KCTC 32390) was discovered to exhibit yellow pigmentation that is characteristic of the carotenoid family. Their assembled genomes also showed that the three strains were not identical in terms of their genome sizes that were 7.82, 7.95 and 8.70 Mb for strains USMAHM13, USMAA1020 and USMAA2-4, respectively, which are slightly larger than that of Cupriavidus necator H16 (7.42 Mb). The average nucleotide identity (ANI) results indicated that the strains were genetically related and the genome pairs belong to the same species. On the basis of the results obtained in this study, the three strains are considered to represent a novel species for which the name Cupriavidus malaysiensis sp. nov. is proposed.
The type strain of the species is USMAA1020T (= DSM 19416T = KCTC 32390T).


July 7, 2019

FDA-CDC antimicrobial resistance isolate bank: A publicly-available resource to support research, development and regulatory requirements.

The FDA-CDC Antimicrobial Resistance Isolate Bank was created in July 2015 as a publicly available resource to combat antimicrobial resistance. It is a curated repository of bacterial isolates with an assortment of clinically-important resistance mechanisms that have been phenotypically and genotypically characterized. In the first two years of operation, the Bank offered 14 panels comprising 496 unique isolates and had filled 486 orders from 394 institutions throughout the United States. New panels are being added. Copyright © 2017 American Society for Microbiology.


July 7, 2019

De novo mutations resolve disease transmission pathways in clonal malaria

Detecting de novo mutations in viral and bacterial pathogens enables researchers to reconstruct detailed networks of disease transmission and is a key technique in genomic epidemiology. However, these techniques have not yet been applied to the malaria parasite, Plasmodium falciparum, in which a larger genome, slower generation times, and a complex life cycle make them difficult to implement. Here, we demonstrate the viability of de novo mutation studies in P. falciparum for the first time. Using a combination of sequencing, library preparation, and genotyping methods that have been optimized for accuracy in low-complexity genomic regions, we have detected de novo mutations that distinguish nominally identical parasites from clonal lineages. Despite its slower evolutionary rate compared with bacterial or viral species, de novo mutation can be detected in P. falciparum across timescales of just 1-2 years and evolutionary rates in low-complexity regions of the genome can be up to twice that detected in the rest of the genome. The increased mutation rate allows the identification of separate clade expansions that cannot be found using previous genomic epidemiology approaches and could be a crucial tool for mapping residual transmission patterns in disease elimination campaigns and reintroduction scenarios.

