July 7, 2019

Genomic dark matter illuminated: Anopheles Y chromosomes.

Hall et al. have strategically used long-read sequencing technology to characterize the structure and highly repetitive content of the Y chromosome in Anopheles malaria mosquitoes. Their work confirms that this important but elusive heterochromatic sex chromosome is evolving extremely rapidly and harbors a remarkably small number of genes. Copyright © 2016 Elsevier Ltd. All rights reserved.


July 7, 2019

The Solanum demissum R8 late blight resistance gene is an Sw-5 homologue that has been deployed worldwide in late blight resistant varieties.

The potato late blight resistance gene R8 has been cloned. R8 is found in five late blight resistant varieties deployed in three different continents. R8 recognises Avr8 and is homologous to the NB-LRR protein Sw-5 from tomato. The broad spectrum late blight resistance gene R8 from Solanum demissum was cloned based on a previously published coarse map position on the lower arm of chromosome IX. Fine mapping in a recombinant population and bacterial artificial chromosome (BAC) library screening resulted in a BAC contig spanning 170 kb of the R8 haplotype. Sequencing revealed a cluster of at least ten R gene analogues (RGAs). The seven RGAs in the genetic window were subcloned for complementation analysis. Only one RGA provided late blight resistance and caused recognition of Avr8. From these results, it was concluded that the newly cloned resistance gene was indeed R8. R8 encodes a typical intracellular immune receptor with an N-terminal coiled coil, a central nucleotide binding site and 13 C-terminal leucine rich repeats. Phylogenetic analysis of a set of representative Solanaceae R proteins shows that R8 resides in a clearly distinct clade together with the Sw-5 tospovirus R protein from tomato. It was found that the R8 gene is present in late blight resistant potato varieties from Europe (Sarpo Mira), USA (Jacqueline Lee, Missaukee) and China (PB-06, S-60). Indeed, when tested under field conditions, R8 transgenic potato plants showed broad spectrum resistance to the current late blight population in the Netherlands, similar to Sarpo Mira.


July 7, 2019

Microsatellite length scoring by Single Molecule Real Time Sequencing – Effects of sequence structure and PCR regime.

Microsatellites are DNA sequences consisting of repeated, short (1-6 bp) sequence motifs that are highly mutable by enzymatic slippage during replication. Due to their high intrinsic variability, microsatellites have important applications in population genetics, forensics, genome mapping, as well as cancer diagnostics and prognosis. The current analytical standard for microsatellites is based on length scoring by high precision electrophoresis, but, owing to their increasing efficiency, next-generation sequencing techniques may provide a viable alternative. Here, we evaluated single molecule real time (SMRT) sequencing, implemented in the PacBio series of sequencing apparatuses, as a means of microsatellite length scoring. To this end, we carried out multiplexed SMRT sequencing of plasmid-carried artificial microsatellites of varying structure under different pre-sequencing PCR regimes. For each repeat structure, reads corresponding to the target length dominated. We found that pre-sequencing amplification had large effects on scoring accuracy and error distribution relative to controls, but that the effects of the number of amplification cycles were generally weak. In line with expectations, enzymatic slippage decreased proportionally with microsatellite repeat unit length and increased with repetition number. Finally, we determined directional mutation trends, showing that PCR and SMRT sequencing introduced consistent but opposing error patterns in contraction and expansion of the microsatellites on the repeat motif and single nucleotide level.
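As a rough illustration of what length scoring from reads involves, the sketch below counts whole motif copies between two flanking sequences in each read and tallies the counts. It is a minimal sketch under stated assumptions, not the pipeline used in the study: the function names, the flank sequences and the toy reads are all hypothetical, and real SMRT reads would need error-tolerant flank matching.

```python
import re
from collections import Counter


def repeat_count(read, motif, left_flank, right_flank):
    """Return the number of whole motif copies between the flanks, or None.

    Hypothetical helper: requires exact flank matches, which real reads
    with sequencing errors would often fail.
    """
    pattern = (
        re.escape(left_flank)
        + "((?:" + re.escape(motif) + ")+)"
        + re.escape(right_flank)
    )
    match = re.search(pattern, read)
    if match is None:
        return None
    return len(match.group(1)) // len(motif)


def score_reads(reads, motif, left_flank, right_flank):
    """Tally repeat counts over all reads in which the locus is found."""
    counts = Counter()
    for read in reads:
        n = repeat_count(read, motif, left_flank, right_flank)
        if n is not None:
            counts[n] += 1
    return counts


# Toy reads (invented): a (CA)10 target plus one contraction and one expansion.
reads = [
    "ACGT" + "CA" * 10 + "TTGG",
    "ACGT" + "CA" * 10 + "TTGG",
    "ACGT" + "CA" * 9 + "TTGG",
    "ACGT" + "CA" * 11 + "TTGG",
    "ACGT" + "CA" * 10 + "TTGG",
]
tally = score_reads(reads, "CA", "ACGT", "TTGG")
called_length, support = tally.most_common(1)[0]
```

Taking the most frequent count as the call mirrors the observation that reads of the target length dominated; the minority counts on either side are the contraction and expansion errors whose direction the study examines.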


July 7, 2019

Structural basis for recombinatorial permissiveness in the generation of Anaplasma marginale Msp2 antigenic variants.

Sequential expression of outer membrane protein antigenic variants is an evolutionarily convergent mechanism used by bacterial pathogens to escape host immune clearance and establish persistent infection. Variants must be sufficiently structurally distinct to escape existing immune effectors yet retain core structural elements required for localization and function within the outer membrane. We examined this balance using Anaplasma marginale, which generates antigenic variants in the outer membrane protein Msp2 using gene conversion. The overwhelming majority of Msp2 variants expressed during long-term persistent infection are mosaics, derived by recombination of oligonucleotide segments from multiple alleles to form unique hypervariable regions (HVR). As a result, the mosaics are not under long-term selective pressure to encode a functional protein; consequently, we hypothesized that the Msp2 HVR is structurally permissive for mosaic expression. Using an integrated approach of predictive modeling with determination of native Msp2 protein structure and function, we demonstrate that structured elements, most notably β-sheets, are significantly concentrated in the highly conserved N- and C-terminal domains. In contrast, the HVR is overwhelmingly random coil, with the structured α-helices and β-sheets confined to the genomically defined “structural tethers” that separate the antigenically variable microdomains. This structure is supported by the surface exposure of the HVR microdomains and the slow diffusion type porin function in native Msp2. Importantly, the predominance of random coil provides plasticity for formation of functional HVR mosaics and realization of the full potential of segmental gene conversion to dramatically expand the variant repertoire. Copyright © 2016, American Society for Microbiology. All Rights Reserved.


July 7, 2019

SAR11 bacteria linked to ocean anoxia and nitrogen loss.

Bacteria of the SAR11 clade constitute up to one half of all microbial cells in the oxygen-rich surface ocean. SAR11 bacteria are also abundant in oxygen minimum zones (OMZs), where oxygen falls below detection and anaerobic microbes have vital roles in converting bioavailable nitrogen to N2 gas. Anaerobic metabolism has not yet been observed in SAR11, and it remains unknown how these bacteria contribute to OMZ biogeochemical cycling. Here, genomic analysis of single cells from the world’s largest OMZ revealed previously uncharacterized SAR11 lineages with adaptations for life without oxygen, including genes for respiratory nitrate reductases (Nar). SAR11 nar genes were experimentally verified to encode proteins catalysing the nitrite-producing first step of denitrification and constituted ~40% of OMZ nar transcripts, with transcription peaking in the anoxic zone of maximum nitrate reduction activity. These results link SAR11 to pathways of ocean nitrogen loss, redefining the ecological niche of Earth’s most abundant organismal group.


July 7, 2019

Distinct Salmonella enteritidis lineages associated with enterocolitis in high-income settings and invasive disease in low-income settings.

An epidemiological paradox surrounds Salmonella enterica serovar Enteritidis. In high-income settings, it has been responsible for an epidemic of poultry-associated, self-limiting enterocolitis, whereas in sub-Saharan Africa it is a major cause of invasive nontyphoidal Salmonella disease, associated with high case fatality. By whole-genome sequence analysis of 675 isolates of S. Enteritidis from 45 countries, we show the existence of a global epidemic clade and two new clades of S. Enteritidis that are geographically restricted to distinct regions of Africa. The African isolates display genomic degradation, a novel prophage repertoire, and an expanded multidrug resistance plasmid. S. Enteritidis is a further example of a Salmonella serotype that displays niche plasticity, with distinct clades that enable it to become a prominent cause of gastroenteritis in association with the industrial production of eggs and of multidrug-resistant, bloodstream-invasive infection in Africa.


July 7, 2019

Variation of 45S rDNA intergenic spacers in Arabidopsis thaliana.

Approximately seven hundred 45S rRNA genes (rDNA) in the Arabidopsis thaliana genome are organised in two 4 Mbp-long arrays of tandem repeats arranged in head-to-tail fashion separated by an intergenic spacer (IGS). These arrays make up 5% of the A. thaliana genome. IGS are rapidly evolving sequences, and frequent rearrangements inside the rDNA loci have generated considerable interspecific and even intra-individual variability, which makes it possible to distinguish among otherwise highly conserved rRNA genes. The IGS has not been comprehensively described despite its potential importance in regulation of rDNA transcription and replication. Here we describe the detailed sequence variation in the complete IGS of A. thaliana WT plants and provide the reference/consensus IGS sequence, as well as genomic DNA analysis. We further investigate mutants dysfunctional in chromatin assembly factor-1 (CAF-1) (fas1 and fas2 mutants), which are known to have a reduced number of rDNA copies, and plant lines with restored CAF-1 function (segregated from a fas1xfas2 genetic background) showing major rDNA rearrangements. The systematic rDNA loss in CAF-1 mutants leads to decreased variability of the IGS and to the occurrence of distinct IGS variants. We present for the first time a comprehensive and representative set of complete IGS sequences, obtained by conventional cloning and by Pacific Biosciences sequencing. Our data expand the knowledge of the A. thaliana IGS sequence arrangement and variability, which has not been available in full and in detail until now. This is also the first study combining IGS sequencing data with RFLP analysis of genomic DNA.


July 7, 2019

Genome puzzle master (GPM): an integrated pipeline for building and editing pseudomolecules from fragmented sequences.

Next generation sequencing technologies have revolutionized our ability to rapidly and affordably generate vast quantities of sequence data. Once generated, raw sequences are assembled into contigs or scaffolds. However, these assemblies are mostly fragmented and inaccurate at the whole genome scale, largely due to the inability to integrate additional informative datasets (e.g. physical, optical and genetic maps). To address this problem, we developed a semi-automated software tool, Genome Puzzle Master (GPM), that enables the integration of additional genomic signposts to edit and build ‘new-gen-assemblies’ that result in high-quality ‘annotation-ready’ pseudomolecules. With GPM, loaded datasets can be connected to each other via their logical relationships, which accomplishes tasks to ‘group,’ ‘merge,’ ‘order and orient’ sequences in a draft assembly. Manual editing can also be performed with a user-friendly graphical interface. Final pseudomolecules reflect a user’s total data package and are available for long-term project management. GPM is a web-based pipeline and an important part of a Laboratory Information Management System (LIMS) which can be easily deployed on local servers for any genome research laboratory. The GPM (with LIMS) package is available at https://github.com/Jianwei-Zhang/LIMS. Contacts: jzhang@mail.hzau.edu.cn or rwing@mail.arizona.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.


July 7, 2019

The report of my death was an exaggeration: A review for researchers using microsatellites in the 21st century.

Microsatellites, or simple sequence repeats (SSRs), have long played a major role in genetic studies due to their typically high polymorphism. They have diverse applications, including genome mapping, forensics, ascertaining parentage, population and conservation genetics, identification of the parentage of polyploids, and phylogeography. We compare SSRs and newer methods, such as genotyping by sequencing (GBS) and restriction site associated DNA sequencing (RAD-Seq), and offer recommendations for researchers considering which genetic markers to use. We also review the variety of techniques currently used for identifying microsatellite loci and developing primers, with a particular focus on those that make use of next-generation sequencing (NGS). Additionally, we review software for microsatellite development and report on an experiment to assess the utility of currently available software for SSR development. Finally, we discuss the future of microsatellites and make recommendations for researchers preparing to use microsatellites. We argue that microsatellites still have an important place in the genomic age as they remain effective and cost-efficient markers.
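To make concrete what identifying microsatellite loci in sequence data means at its simplest, here is a minimal sketch that scans a sequence for perfect tandem repeats with a regular expression. It is not one of the tools assessed in the review; the function name, the default thresholds and the test sequence are assumptions chosen for illustration, and dedicated SSR software also handles imperfect and compound repeats.

```python
import re


def find_ssrs(sequence, min_motif=1, max_motif=6, min_repeats=4):
    """Return (start, motif, copies) for perfect tandem repeats.

    The motif group is matched lazily so that the shortest repeating unit
    is reported (e.g. "AG" rather than "AGAG").
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(sequence)
    ]


# Invented sequence containing an (AG)8 and a (CAT)5 repeat.
sequence = "GGATC" + "AG" * 8 + "TTCGA" + "CAT" * 5 + "GGC"
hits = find_ssrs(sequence)
```

Each hit gives the 0-based start position, the repeat unit and the number of copies; primer design would then target the sequence flanking each reported locus.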


July 7, 2019

Privacy-preserving read mapping using locality sensitive hashing and secure kmer voting

The recent explosion in the amount of available genome sequencing data imposes high computational demands on the tools designed to analyze it. Low-cost cloud computing has the potential to alleviate this burden. However, moving personal genome data analysis to the cloud raises serious privacy concerns. Read alignment is a critical and computationally intensive first step of most genomic data analysis pipelines. While significant effort has been dedicated to optimize the sensitivity and runtime efficiency of this step, few approaches have addressed outsourcing this computation securely to an untrusted party. The few secure solutions that have been proposed either do not scale to whole genome sequencing datasets or are not competitive with the state of the art in read mapping. In this paper, we present BALAUR, a privacy-preserving read mapping algorithm based on locality sensitive hashing and secure kmer voting. BALAUR securely outsources a significant portion of the computation to the public cloud by formulating the alignment task as a voting scheme between encrypted read and reference kmers. Our approach can easily handle typical genome-scale datasets and is highly competitive with non-cryptographic state-of-the-art read aligners in both accuracy and runtime performance on simulated and real read data. Moreover, our approach is significantly faster than state-of-the-art read aligners in long read mapping.
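The voting idea can be sketched in a few lines: every kmer shared between a read and the reference votes for the alignment start it implies, and the position with the most votes wins. The code below is a toy under stated assumptions, not BALAUR's implementation: the function names are hypothetical, a keyed HMAC stands in for the encryption of kmers (so the party doing the lookup sees only digests), the locality sensitive hashing stage is omitted, and no claim of actual security is made.

```python
import hashlib
import hmac
from collections import Counter


def kmers(seq, k):
    """All overlapping substrings of length k, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def masked(kmer, key):
    """Keyed digest of a kmer (illustrative stand-in for encryption)."""
    return hmac.new(key, kmer.encode(), hashlib.sha256).hexdigest()


def build_index(reference, k, key):
    """Map each masked reference kmer to the positions where it occurs."""
    index = {}
    for pos, km in enumerate(kmers(reference, k)):
        index.setdefault(masked(km, key), []).append(pos)
    return index


def vote(read, index, k, key):
    """Return (start, votes) for the best-supported alignment start, or None."""
    votes = Counter()
    for offset, km in enumerate(kmers(read, k)):
        for pos in index.get(masked(km, key), []):
            votes[pos - offset] += 1  # implied start of the read on the reference
    return votes.most_common(1)[0] if votes else None


key = b"demo-key"  # hypothetical shared secret
reference = "TTGACCGTAGGCTAACGTTAGCCATGGATCCGATTACGCAGT"
read = reference[12:30]
noisy_read = read[:9] + "A" + read[10:]  # one substitution (C to A)

index = build_index(reference, 6, key)
best = vote(read, index, 6, key)
best_noisy = vote(noisy_read, index, 6, key)
```

Because only digests are compared, a single substitution simply removes the votes of the kmers that overlap it; the remaining kmers still agree on the same start, which is why voting tolerates a moderate error rate.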


July 7, 2019

Association between progranulin and Gaucher disease.

Gaucher disease (GD) is a genetic disease caused by mutations in the GBA1 gene which result in reduced enzymatic activity of β-glucocerebrosidase (GCase). This study identified the progranulin (PGRN) gene (GRN) as another gene associated with GD. Serum levels of PGRN were measured in 115 GD patients and 99 healthy controls, the whole GRN gene from 40 GD patients was sequenced, and genotyping of 4 SNPs identified in GD patients was performed in 161 GD and 142 healthy control samples. Development of GD in PGRN-deficient mice was characterized, and the therapeutic effect of rPGRN on GD was analyzed. Serum PGRN levels were significantly lower in GD patients (96.65 ± 53.45 ng/ml) than in healthy controls of the general population (164.99 ± 43.16 ng/ml, p<0.0001) and of Ashkenazi Jews (150.64 ± 33.99 ng/ml, p<0.0001). Four GRN gene SNPs (rs4792937, rs78403836, rs850713, and rs5848) and three point mutations were identified by full-length GRN gene sequencing in 40 GD patients. Large scale SNP genotyping in 161 GD patients and 142 healthy controls was conducted, and the four SNP sites had a significantly higher frequency in GD patients. In addition, "aged" and challenged adult PGRN null mice develop GD-like phenotypes, including typical Gaucher-like cells in lung, spleen, and bone marrow. Moreover, lysosomes in PGRN KO mice exhibit a tubular-like appearance. PGRN is required for the lysosomal appearance of GCase and its deficiency leads to GCase accumulation in the cytoplasm. More importantly, recombinant PGRN is therapeutic in various animal models of GD and human fibroblasts from GD patients. Our data demonstrate a previously unknown association between PGRN and GD and identify PGRN as an essential factor for GCase's lysosomal localization. These findings not only provide new insight into the pathogenesis of GD, but may also have implications for diagnosis and alternative targeted therapies for GD. Copyright © 2016 Forschungsgesellschaft für Arbeitsphysiologie und Arbeitschutz e.V. Published by Elsevier B.V. All rights reserved.


July 7, 2019

Hyper-eccentric structural genes in the mitochondrial genome of the algal parasite Hemistasia phaeocysticola.

Diplonemid mitochondria are considered to have very eccentric structural genes. Coding regions of individual diplonemid mitochondrial genes are fragmented into small pieces and found on different circular DNAs. Short RNAs transcribed from each DNA molecule mature through a unique RNA maturation process involving assembly and three types of RNA editing (i.e., U insertion and A-to-I & C-to-U substitutions), although the molecular mechanism(s) of RNA maturation and the evolutionary history of these eccentric structural genes still remain to be understood. Since the gene fragmentation pattern is generally conserved among the diplonemid species studied to date, it was considered that their structural complexity has plateaued and further gene fragmentation could not occur. Here, we show the mitochondrial gene structure of Hemistasia phaeocysticola, which was recently identified as a member of a novel lineage in diplonemids, by comparison of the mitochondrial DNA sequences with cDNA sequences synthesized from mature mRNA. The genes of H. phaeocysticola are fragmented much more finely than those of other diplonemids studied to date. Furthermore, in addition to all known types of RNA editing, it is suggested that a novel processing step (i.e., secondary RNA insertion) is involved in the RNA maturation in the mitochondria of H. phaeocysticola. Our findings demonstrate the tremendous plasticity of mitochondrial gene structures. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


July 7, 2019

Third-generation sequencing and analysis of four complete pig liver esterase gene sequences in clones identified by screening BAC library.

Pig liver carboxylesterase (PLE) gene sequences in GenBank are incomplete, which has led to difficulties in studying the genetic structure and regulation mechanisms of gene expression of PLE family genes. The aim of this study was to obtain and analyse complete gene sequences of the PLE family by screening a Rongchang pig BAC library and third-generation PacBio gene sequencing. After a number of existing incomplete PLE isoform gene sequences were analysed, primers were designed based on conserved regions in PLE exons, and the whole pig genome was used as a template for polymerase chain reaction (PCR) amplification. Specific primers were then selected based on the PCR amplification results. A three-step PCR screening method was used to identify PLE-positive clones by screening a Rongchang pig BAC library, and PacBio third-generation sequencing was performed. BLAST comparisons and other bioinformatics methods were applied for sequence analysis. Five PLE-positive BAC clones, designated BAC-10, BAC-70, BAC-75, BAC-119 and BAC-206, were identified. Sequence analysis yielded the complete sequences of four PLE genes, PLE1, PLE-B9, PLE-C4, and PLE-G2. Complete PLE gene sequences were defined as those containing regulatory sequences, exons, and introns. It was found not only that the PLE exon sequences of the four genes showed a high degree of homology, but also that the intron sequences were highly similar. Additionally, the regulatory region of the genes contained two 720 bp reverse complement sequences that may have an important function in the regulation of PLE gene expression. This is the first report to confirm the complete sequences of four PLE genes. In addition, the study demonstrates that each PLE isoform is encoded by a single gene and that the various genes exhibit a high degree of sequence homology, suggesting that the PLE family evolved from a single ancestral gene. Obtaining the complete sequences of these PLE genes provides the necessary foundation for investigation of the genetic structure, function, and regulatory mechanisms of the PLE gene family.


July 7, 2019

BAC-pool sequencing and analysis confirms growth-associated QTLs in the Asian seabass genome.

The Asian seabass is an important marine food fish that has been cultured for several decades in Asia Pacific. However, the lack of a high quality reference genome has hampered efforts to improve its selective breeding. A 3D BAC pool set generated in this study was screened using 22 SSR markers located on linkage group 2, which contains a growth-related QTL region. Seventy-two clones corresponding to 22 FPC contigs were sequenced by Illumina MiSeq technology. We co-assembled the MiSeq-derived scaffolds from each FPC contig with error-corrected PacBio reads, resulting in 187 sequences covering 9.7 Mb. Eleven genes annotated within this region were found to be potentially associated with growth and their tissue-specific expression was investigated. Correlation analysis demonstrated that SNPs in ctsb, skp1 and ppp2ca can be potentially used as markers for selecting fast-growing fingerlings. Conserved syntenies between seabass LG2 and five other teleosts were identified. This study i) provided a 10 Mb targeted genome assembly; ii) demonstrated NGS of BAC pools as a potential approach for mining candidates underlying QTLs of this species; iii) detected eleven genes potentially responsible for growth in the QTL region; and iv) identified useful SNP markers for selective breeding programs of Asian seabass.

