July 7, 2019

Genome puzzle master (GPM): an integrated pipeline for building and editing pseudomolecules from fragmented sequences.

Next generation sequencing technologies have revolutionized our ability to rapidly and affordably generate vast quantities of sequence data. Once generated, raw sequences are assembled into contigs or scaffolds. However, these assemblies are mostly fragmented and inaccurate at the whole genome scale, largely due to the inability to integrate additional informative datasets (e.g. physical, optical and genetic maps). To address this problem, we developed a semi-automated software tool, Genome Puzzle Master (GPM), that enables the integration of additional genomic signposts to edit and build ‘new-gen-assemblies’ that result in high-quality ‘annotation-ready’ pseudomolecules. With GPM, loaded datasets can be connected to each other via their logical relationships, which accomplishes tasks to ‘group,’ ‘merge,’ ‘order and orient’ sequences in a draft assembly. Manual editing can also be performed with a user-friendly graphical interface. Final pseudomolecules reflect a user’s total data package and are available for long-term project management. GPM is a web-based pipeline and an important part of a Laboratory Information Management System (LIMS) which can be easily deployed on local servers for any genome research laboratory. The GPM (with LIMS) package is available at https://github.com/Jianwei-Zhang/LIMS. Contact: jzhang@mail.hzau.edu.cn or rwing@mail.arizona.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.


July 7, 2019

The report of my death was an exaggeration: A review for researchers using microsatellites in the 21st century.

Microsatellites, or simple sequence repeats (SSRs), have long played a major role in genetic studies due to their typically high polymorphism. They have diverse applications, including genome mapping, forensics, ascertaining parentage, population and conservation genetics, identification of the parentage of polyploids, and phylogeography. We compare SSRs and newer methods, such as genotyping by sequencing (GBS) and restriction site associated DNA sequencing (RAD-Seq), and offer recommendations for researchers considering which genetic markers to use. We also review the variety of techniques currently used for identifying microsatellite loci and developing primers, with a particular focus on those that make use of next-generation sequencing (NGS). Additionally, we review software for microsatellite development and report on an experiment to assess the utility of currently available software for SSR development. Finally, we discuss the future of microsatellites and make recommendations for researchers preparing to use microsatellites. We argue that microsatellites still have an important place in the genomic age as they remain effective and cost-efficient markers.


July 7, 2019

An ultra-high density genetic linkage map of perennial ryegrass (Lolium perenne) using genotyping by sequencing (GBS) based on a reference shotgun genome assembly.

High density genetic linkage maps that are extensively anchored to assembled genome sequences of the organism in question are extremely useful in gene discovery. To facilitate this process in perennial ryegrass (Lolium perenne L.), a high density single nucleotide polymorphism (SNP)- and presence/absence variant (PAV)-based genetic linkage map has been developed in an F2 mapping population that has been used as a reference population in numerous studies. To provide a reference sequence to which to align genotyping by sequencing (GBS) reads, a shotgun assembly of one of the grandparents of the population, a tenth-generation inbred line, was created using Illumina-based sequencing. The assembly was based on paired-end Illumina reads, scaffolded by mate pair and long jumping distance reads in the range of 3-40 kb, with >200-fold initial genome coverage. A total of 169 individuals from an F2 mapping population were used to construct PstI-based GBS libraries tagged with unique 4-9 nucleotide barcodes, resulting in 284 million reads, with approx. 1·6 million reads per individual. A bioinformatics pipeline was employed to identify both SNPs and PAVs. A core genetic map was generated using high confidence SNPs, to which lower confidence SNPs and PAVs were subsequently fitted in a straightforward binning approach. The assembly comprises 424 750 scaffolds, covering 1·11 Gbp of the 2·5 Gbp perennial ryegrass genome, with a scaffold N50 of 25 212 bp and a contig N50 of 3790 bp. It is available for download, and access to a genome browser has been provided. Comparison of the assembly with available transcript and gene model data sets for perennial ryegrass indicates that approx. 570 Mbp of the gene-rich portion of the genome has been captured. An ultra-high density genetic linkage map with 3092 SNPs and 7260 PAVs was developed, anchoring just over 200 Mb of the reference assembly. The combined genetic map and assembly, combined with another recently released genome assembly, represent a significant resource for the perennial ryegrass genetics community. © The Author 2016. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
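
The scaffold and contig N50 values quoted above are summary statistics of an assembly's length distribution. As an illustration only, the following is a minimal sketch of the usual N50 definition (the length of the sequence at which the cumulative sum of lengths, taken from longest to shortest, first reaches half of the total assembly length). The function name `n50` and the example lengths are assumptions for this sketch and are not taken from the ryegrass study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches at least half
    of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("n50() requires at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running total to avoid fractional halves.
        if running * 2 >= total:
            return length


# Hypothetical scaffold lengths in bp, for illustration only.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example_lengths))
```

A higher N50 means that more of the assembly sits in long sequences, which is why the scaffold N50 reported above is larger than the contig N50.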


July 7, 2019

Whole-genome sequencing recommendations

Recent technological developments have revolutionized the way we perform genetic analyses. In particular, whole-genome sequencing provides access to the entire genetic makeup of an individual, and it is now an affordable approach for many research groups. As a consequence, genome sequencing is pervading many fields of biological research. Sequencing technologies are evolving rapidly, and so are their applications. Here we provide a first primer on whole-genome sequencing, focusing on two of the most popular applications: (1) de novo genome sequencing, in which the objective is obtaining a high-quality genome assembly that can serve as a reference for a species or variety, and (2) genome resequencing, when there is an available reference genome and the objective is to map sequence variation of an individual or a set of individuals. It is not our intention to provide a comprehensive overview of current methodologies that will likely soon become obsolete, but rather to focus on general principles that will have a more general applicability.


July 7, 2019

Salmonella degrades the host glycocalyx leading to altered infection and glycan remodeling.

Complex glycans cover the gut epithelial surface to protect the cell from the environment. Invasive pathogens must breach the glycan layer before initiating infection. While glycan degradation is crucial for infection, this process is inadequately understood. Salmonella contains 47 glycosyl hydrolases (GHs) that may degrade the glycan. We hypothesized that keystone genes from the entire GH complement of Salmonella are required to degrade glycans to change infection. This study determined that GHs recognize the terminal monosaccharides (N-acetylneuraminic acid (Neu5Ac), galactose, mannose, and fucose) and significantly (p?


July 7, 2019

Privacy-preserving read mapping using locality sensitive hashing and secure kmer voting

The recent explosion in the amount of available genome sequencing data imposes high computational demands on the tools designed to analyze it. Low-cost cloud computing has the potential to alleviate this burden. However, moving personal genome data analysis to the cloud raises serious privacy concerns. Read alignment is a critical and computationally intensive first step of most genomic data analysis pipelines. While significant effort has been dedicated to optimize the sensitivity and runtime efficiency of this step, few approaches have addressed outsourcing this computation securely to an untrusted party. The few secure solutions that have been proposed either do not scale to whole genome sequencing datasets or are not competitive with the state of the art in read mapping. In this paper, we present BALAUR, a privacy-preserving read mapping algorithm based on locality sensitive hashing and secure kmer voting. BALAUR securely outsources a significant portion of the computation to the public cloud by formulating the alignment task as a voting scheme between encrypted read and reference kmers. Our approach can easily handle typical genome-scale datasets and is highly competitive with non-cryptographic state-of-the-art read aligners in both accuracy and runtime performance on simulated and real read data. Moreover, our approach is significantly faster than state-of-the-art read aligners in long read mapping.
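
The idea of treating alignment as a vote among kmers can be illustrated without any cryptography. The sketch below is not BALAUR's implementation: it omits the locality sensitive hashing and the encryption entirely, and it only shows plain kmer voting, in which each read kmer that occurs in the reference votes for the reference position the read would start at. All function names and the toy sequences are assumptions made for this example.

```python
from collections import Counter, defaultdict


def build_kmer_index(reference, k):
    """Map every kmer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def vote_for_position(read, index, k):
    """Return (best_start, votes) for the read, or None if no kmer matches.

    Each read kmer found in the index votes for the implied read start,
    i.e. the reference position of the kmer minus its offset in the read.
    """
    votes = Counter()
    for offset in range(len(read) - k + 1):
        kmer = read[offset:offset + k]
        for ref_pos in index.get(kmer, ()):
            votes[ref_pos - offset] += 1
    if not votes:
        return None
    return votes.most_common(1)[0]


# Toy data, for illustration only.
reference = "ACGTTGCAAGGCTTACCGTA"
read = reference[5:13]  # an error-free read drawn from position 5
index = build_kmer_index(reference, 4)
print(vote_for_position(read, index, 4))
```

In a privacy-preserving setting such as the one the abstract describes, the kmers exchanged with the cloud would be encrypted; this sketch leaves that part out and keeps only the voting logic.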


July 7, 2019

Evolutionary architecture of the infant-adapted group of Bifidobacterium species associated with the probiotic function.

Bifidobacteria, often associated with the gastrointestinal tract of animals, are well known for their roles as probiotics. Among the dozens of Bifidobacterium species, Bifidobacterium bifidum, B. breve, and B. longum are the ones most frequently isolated from the feces of infants and known to help the digestion of human milk oligosaccharides. To investigate the correlation between the metabolic properties of bifidobacteria and their phylogeny, we performed a phylogenomic analysis based on 452 core genes of forty-four completely sequenced Bifidobacterium species. Results show that a major evolutionary event leading to the clade of the infant-adapted species is linked to carbohydrate metabolism, but it is not the only factor responsible for the adaptation of bifidobacteria to the gut. The genome of B. longum subsp. infantis, a typical bifidobacterium in the gut of breast-fed infants, encodes proteins associated with several kinds of species-specific metabolic pathways, including urea metabolism and biosynthesis of riboflavin and lantibiotics. Our results demonstrate that these metabolic features, which are associated with the probiotic function of bifidobacteria, are species-specific and highly correlate with their phylogeny. Copyright © 2016 Elsevier GmbH. All rights reserved.


July 7, 2019

Association between progranulin and Gaucher disease.

Gaucher disease (GD) is a genetic disease caused by mutations in the GBA1 gene which result in reduced enzymatic activity of β-glucocerebrosidase (GCase). This study identified the progranulin (PGRN) gene (GRN) as another gene associated with GD. Serum levels of PGRN were measured from 115 GD patients and 99 healthy controls, whole GRN gene from 40 GD patients was sequenced, and the genotyping of 4 SNPs identified in GD patients was performed in 161 GD and 142 healthy control samples. Development of GD in PGRN-deficient mice was characterized, and the therapeutic effect of rPGRN on GD analyzed. Serum PGRN levels were significantly lower in GD patients (96.65±53.45 ng/ml) than those in healthy controls of the general population (164.99±43.16 ng/ml, p<0.0001) and of Ashkenazi Jews (150.64±33.99 ng/ml, p<0.0001). Four GRN gene SNPs, including rs4792937, rs78403836, rs850713, and rs5848, and three point mutations, were identified in a full-length GRN gene sequencing in 40 GD patients. Large scale SNP genotyping in 161 GD and 142 healthy controls was conducted and the four SNP sites have significantly higher frequency in GD patients. In addition, "aged" and challenged adult PGRN null mice develop GD-like phenotypes, including typical Gaucher-like cells in lung, spleen, and bone marrow. Moreover, lysosomes in PGRN KO mice exhibit a tubular-like appearance. PGRN is required for the lysosomal appearance of GCase and its deficiency leads to GCase accumulation in the cytoplasm. More importantly, recombinant PGRN is therapeutic in various animal models of GD and human fibroblasts from GD patients. Our data demonstrate an unknown association between PGRN and GD and identify PGRN as an essential factor for GCase's lysosomal localization. These findings not only provide new insight into the pathogenesis of GD, but may also have implications for diagnosis and alternative targeted therapies for GD. Copyright © 2016 Forschungsgesellschaft für Arbeitsphysiologie und Arbeitschutz e.V. Published by Elsevier B.V. All rights reserved.


July 7, 2019

Strategies for sequence assembly of plant genomes

The field of plant genome assembly has greatly benefited from the development and widespread adoption of next-generation DNA sequencing platforms. Very high sequencing throughputs and low costs per nucleotide have considerably reduced the technical and budgetary constraints associated with early assembly projects done primarily with a traditional Sanger-based approach. Those improvements led to a sharp increase in the number of plant genomes being sequenced, including large and complex genomes of economically important crops. Although next-generation DNA sequencing has considerably improved our understanding of the overall structure and dynamics of many plant genomes, severe limitations still remain because next-generation DNA sequencing reads typically are shorter than Sanger reads. In addition, the software tools used to de novo assemble sequences are not necessarily designed to optimize the use of short reads. These cause challenges, common to many plant species with large genome sizes, high repeat contents, polyploidy and genome-wide duplications. This chapter provides an overview of historical and current methods used to sequence and assemble plant genomes, along with new solutions offered by the emergence of technologies such as single molecule sequencing and optical mapping to address the limitations of current sequence assemblies.


July 7, 2019

Development of Streptomyces sp. FR-008 as an emerging chassis

Microbial-derived natural products are important in both the pharmaceutical industry and academic research. Because the metabolic potential of original producers, especially Streptomyces, is often limited by slow growth rate, complicated cultivation profile, and unfeasible genetic manipulation, developing a Streptomyces strain as a superior industrial chassis is valuable and urgent. Streptomyces sp. FR-008 is a fast-growing microorganism and can also produce a considerable amount of the macrolide candicidin via modular polyketide synthase. In this study, we evaluated Streptomyces sp. FR-008 as a potential industrial-production chassis. First, PacBio sequencing and transcriptome analyses indicated that the Streptomyces sp. FR-008 genome size is 7.26 Mb, which is one of the smallest among currently sequenced Streptomyces genomes. In addition, we simplified the conjugation procedure by omitting heat-shock and pre-germination treatments while retaining high conjugation efficiency, suggesting that the strain is inherently capable of accepting heterologous DNA. Furthermore, a series of promoters selected from the literature was assessed based on GusA activity in Streptomyces sp. FR-008. Compared with the commonly used promoter ermE*-p, these promoters form a library with constitutive strengths ranging from 60–860%, thus providing useful regulatory elements for future genetic engineering. To minimize the genome, we also deleted three endogenous polyketide synthase (PKS) gene clusters in a targeted manner to generate a mutant, LQ3. LQ3 is thus an “updated” version of Streptomyces sp. FR-008 that produces fewer secondary metabolites than the parent strain. We believe this work could facilitate further development of Streptomyces sp. FR-008 for use in biotechnological applications.


July 7, 2019

Emergence of epidemic Neisseria meningitidis serogroup C in Niger, 2015: an analysis of national surveillance data.

To combat Neisseria meningitidis serogroup A epidemics in the meningitis belt of sub-Saharan Africa, a meningococcal serogroup A conjugate vaccine (MACV) has been progressively rolled out since 2010. We report the first meningitis epidemic in Niger since the nationwide introduction of MACV. We compiled and analysed nationwide case-based meningitis surveillance data in Niger. Cases were confirmed by culture or direct real-time PCR, or both, of cerebrospinal fluid specimens, and whole-genome sequencing was used to characterise isolates. Information on vaccination campaigns was collected by the Niger Ministry of Health and WHO. From Jan 1 to June 30, 2015, 9367 suspected meningitis cases and 549 deaths were reported in Niger. Among 4301 cerebrospinal fluid specimens tested, 1603 (37·3%) were positive for a bacterial pathogen, including 1147 (71·5%) that were positive for N meningitidis serogroup C (NmC). Whole-genome sequencing of 77 NmC isolates revealed the strain to be ST-10217. Although vaccination campaigns were limited in scope because of a global vaccine shortage, 1·4 million people were vaccinated from March to June, 2015. This epidemic represents the largest global NmC outbreak so far and shows the continued threat of N meningitidis in sub-Saharan Africa. The risk of further regional expansion of this novel clone highlights the need for continued strengthening of case-based surveillance. The availability of an affordable, multivalent conjugate vaccine may be important in future epidemic response. Funding: MenAfriNet consortium, a partnership between the US Centers for Disease Control and Prevention, WHO, and Agence de Médecine Preventive, through a grant from the Bill & Melinda Gates Foundation. Copyright © 2016 World Health Organization. Published by Elsevier Ltd/Inc/BV. All rights reserved.


July 7, 2019

Next-generation sequencing: a diagnostic one-stop shop for Hepatitis C?

Before starting chronic hepatitis C treatment, the viral genotype/subtype has to be accurately determined and potentially coupled with drug resistance testing. Due to the high genetic variability of the hepatitis C virus, this can be a demanding task that can potentially be streamlined by viral whole-genome sequencing using next-generation sequencing as demonstrated by an article in this issue of the Journal of Clinical Microbiology by E. Thomson, C. L. C. Ip, A. Badhan, M. T. Christiansen, W. Adamson, et al. (J Clin Microbiol. 54:2455-2469, 2016, http://dx.doi.org/10.1128/JCM.00330-16). Copyright © 2016, American Society for Microbiology. All Rights Reserved.


July 7, 2019

Hyper-eccentric structural genes in the mitochondrial genome of the algal parasite Hemistasia phaeocysticola.

Diplonemid mitochondria are considered to have very eccentric structural genes. Coding regions of individual diplonemid mitochondrial genes are fragmented into small pieces and found on different circular DNAs. Short RNAs transcribed from each DNA molecule mature through a unique RNA maturation process involving assembly and three types of RNA editing (i.e., U insertion and A-to-I & C-to-U substitutions), although the molecular mechanism(s) of RNA maturation and the evolutionary history of these eccentric structural genes still remain to be understood. Since the gene fragmentation pattern is generally conserved among the diplonemid species studied to date, it was considered that their structural complexity has plateaued and further gene fragmentation could not occur. Here, we show the mitochondrial gene structure of Hemistasia phaeocysticola, which was recently identified as a member of a novel lineage in diplonemids, by comparison of the mitochondrial DNA sequences with cDNA sequences synthesized from mature mRNA. The genes of H. phaeocysticola are fragmented much more finely than those of other diplonemids studied to date. Furthermore, in addition to all known types of RNA editing, it is suggested that a novel processing step (i.e., secondary RNA insertion) is involved in the RNA maturation in the mitochondria of H. phaeocysticola. Our findings demonstrate the tremendous plasticity of mitochondrial gene structures. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


July 7, 2019

The effects of signal erosion and core genome reduction on the identification of diagnostic markers.

Whole-genome sequence (WGS) data are commonly used to design diagnostic targets for the identification of bacterial pathogens. To do this effectively, genomics databases must be comprehensive to identify the strict core genome that is specific to the target pathogen. As additional genomes are analyzed, the core genome size is reduced and there is erosion of the target-specific regions due to commonality with related species, potentially resulting in the identification of false positives and/or false negatives. A comparative analysis of 1,130 Burkholderia genomes identified unique markers for many named species, including the human pathogens B. pseudomallei and B. mallei. Due to core genome reduction and signature erosion, only 38 targets specific to B. pseudomallei/mallei were identified. By using only public genomes, a larger number of markers were identified, due to undersampling, and this larger number represents the potential for false positives. This analysis has implications for the design of diagnostics for other species where the genomic space of the target and/or closely related species is not well defined. Copyright © 2016 Sahl et al.
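
The two effects named in this abstract can be pictured with simple set operations. The sketch below is an illustration under stated assumptions, not the method used in the study: each genome is reduced to a set of hypothetical gene or region identifiers, the core genome is the intersection across target genomes, and a candidate marker is any core element not seen in a related, non-target genome. The function names and toy data are invented for this example.

```python
def core_genome_sizes(genomes):
    """Return the core genome size after each successive genome is added.

    The core is the set of elements shared by every genome seen so far,
    so the sizes can only stay flat or shrink (core genome reduction).
    """
    sizes = []
    core = None
    for genes in genomes:
        core = set(genes) if core is None else core & set(genes)
        sizes.append(len(core))
    return sizes


def target_specific_markers(target_genomes, related_genomes):
    """Return core elements of the targets that no related genome carries.

    Adding related genomes can only remove candidates (signature erosion).
    """
    if not target_genomes:
        raise ValueError("at least one target genome is required")
    core = set.intersection(*(set(g) for g in target_genomes))
    seen_elsewhere = set().union(*(set(g) for g in related_genomes))
    return core - seen_elsewhere


# Toy identifiers, for illustration only.
targets = [{"a", "b", "c", "d"}, {"a", "b", "c"}, {"a", "b", "e"}]
print(core_genome_sizes(targets))
print(sorted(target_specific_markers(targets, [{"a", "x"}])))
print(sorted(target_specific_markers(targets, [])))
```

With no related genomes sampled, both shared elements look like markers; sampling one related genome removes one of them. This mirrors the abstract's point that undersampling inflates the apparent number of specific targets.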

