September 22, 2019

N6-methyladenine DNA modification in the human genome.

DNA N6-methyladenine (6mA) modification is the most prevalent DNA modification in prokaryotes, but whether it exists in human cells and whether it plays a role in human diseases remain enigmatic. Here, we showed that 6mA is extensively present in the human genome, and we cataloged 881,240 6mA sites accounting for ~0.051% of the total adenines. [G/C]AGG[C/T] was the most significantly associated motif with 6mA modification. 6mA sites were enriched in the coding regions and mark actively transcribed genes in human cells. DNA 6mA and N6-demethyladenine modification in the human genome were mediated by methyltransferase N6AMT1 and demethylase ALKBH1, respectively. The abundance of 6mA was significantly lower in cancers, accompanied by decreased N6AMT1 and increased ALKBH1 levels, and downregulation of 6mA modification levels promoted tumorigenesis. Collectively, our results demonstrate that DNA 6mA modification is extensively present in human cells and the decrease of genomic DNA 6mA promotes human tumorigenesis. Copyright © 2018 Elsevier Inc. All rights reserved.
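As an illustration of what the reported motif means in practice, the following minimal sketch scans a sequence for [G/C]AGG[C/T] using a regular expression. The function name and the example sequences are hypothetical, only the given strand is scanned, and this is not the authors' motif-discovery pipeline.

```python
import re

# The motif [G/C]AGG[C/T] from the abstract, written as a regex.
# A lookahead is used so that overlapping occurrences are all reported.
MOTIF = re.compile(r"(?=([GC]AGG[CT]))")


def find_motif_sites(seq):
    """Return 0-based start positions of each motif occurrence on the given strand."""
    return [m.start() for m in MOTIF.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical toy sequence, not real genomic data.
    print(find_motif_sites("TTGAGGCAACAGGTGG"))
```

Scanning the reverse complement as well would be needed for a strand-aware count; that step is left out here to keep the sketch short.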


September 22, 2019

npInv: accurate detection and genotyping of inversions using long read sub-alignment.

Detection of genomic inversions remains challenging. Many existing methods primarily target inversions with a non-repetitive breakpoint, leaving inverted repeat (IR) mediated non-allelic homologous recombination (NAHR) inversions largely unexplored. We present npInv, a novel tool specifically for detecting and genotyping NAHR inversions using long read sub-alignment of long read sequencing data. We benchmark npInv against other tools on both simulated and real data. We use npInv to generate a whole-genome inversion map for NA12878 consisting of 30 NAHR inversions (of which 15 are novel), including all previously known NAHR mediated inversions in NA12878 with flanking IR less than 7kb. Our genotyping accuracy on this dataset was 94%. We used PCR to confirm the presence of two of these novel inversions. We show that there is a near linear relationship between the length of flanking IR and the minimum inversion size, without inverted repeats. The application of npInv shows high accuracy in both simulation and real data. The results give deeper insight into understanding inversion.


September 22, 2019

Genomic variation among and within six Juglans species.

Genomic analysis in Juglans (walnuts) is expected to transform the breeding and agricultural production of both nuts and lumber. To that end, we report here the determination of reference sequences for six additional relatives of Juglans regia: Juglans sigillata (also from section Dioscaryon), Juglans nigra, Juglans microcarpa, Juglans hindsii (from section Rhysocaryon), Juglans cathayensis (from section Cardiocaryon), and the closely related Pterocarya stenoptera. While these are ‘draft’ genomes, ranging in size between 640Mbp and 990Mbp, their contiguities and accuracies can support powerful annotations of genomic variation that are often the foundation of new avenues of research and breeding. We annotated nucleotide divergence and synteny by creating complete pairwise alignments of each reference genome to the remaining six. In addition, we have re-sequenced a sample of accessions from four Juglans species (including regia). The variation discovered in these surveys comprises a critical resource for experimentation and breeding, as well as a solid complementary annotation. To demonstrate the potential of these resources, the structural and sequence variation in and around the polyphenol oxidase loci, PPO1 and PPO2, were investigated. As reported for other seed crops, variation in this gene is implicated in the domestication of walnuts. The apparently Juglandaceae-specific PPO1 duplicate shows accelerated divergence and an excess of amino acid replacement on the lineage leading to accessions of the domesticated nut crop species, Juglans regia and sigillata. Copyright © 2018 Stevens et al.


September 22, 2019

Complete genome sequencing and comparative genomic analysis of Helicobacter apodemus isolated from the wild Korean striped field mouse (Apodemus agrarius) for potential pathogenicity

The Helicobacter bacterial genus comprises spiral-shaped gram-negative bacteria with flagella that colonize the gastro-intestinal (GI) tract of humans and various mammals (Solnick and Schauer, 2001). In particular, Helicobacter pylori was classified as a group 1 carcinogen by the International Agency for Research on Cancer (IARC) in 1994, and has been shown to occur with a high prevalence in humans, although this varies between geographical regions, ethnic groups, and various populations (Kusters et al., 2006; Goh et al., 2011). To date, more than 37 Helicobacter species have been identified in addition to H. pylori (Péré-Védrenne et al., 2017). Furthermore, non-H. pylori Helicobacters (NHPH) have been shown to infect both humans and animals, and NHPH infections are associated with intestinal carcinoma and mucinous adenocarcinoma (Swennes et al., 2016). Despite the demonstrated association between NHPH and disease, most studies to date have investigated H. pylori in humans; thus, it is necessary to characterize NHPH and elucidate its role in the GI tract of wild rodents, which are potential Helicobacter carriers (Taylor et al., 2007; Mladenova-Hristova et al., 2017).


September 22, 2019

A rapid method for directed gene knockout for screening in G0 zebrafish.

Zebrafish is a powerful model for forward genetics. Reverse genetic approaches are limited by the time required to generate stable mutant lines. We describe a system for gene knockout that consistently produces null phenotypes in G0 zebrafish. Yolk injection of sets of four CRISPR/Cas9 ribonucleoprotein complexes redundantly targeting a single gene recapitulated germline-transmitted knockout phenotypes in >90% of G0 embryos for each of 8 test genes. Early embryonic (6 hpf) and stable adult phenotypes were produced. Simultaneous multi-gene knockout was feasible but associated with toxicity in some cases. To facilitate use, we generated a lookup table of four-guide sets for 21,386 zebrafish genes and validated several. Using this resource, we targeted 50 cardiomyocyte transcriptional regulators and uncovered a role of zbtb16a in cardiac development. This system provides a platform for rapid screening of genes of interest in development, physiology, and disease models in zebrafish. Copyright © 2018 Elsevier Inc. All rights reserved.


September 22, 2019

MultiMotifMaker: a multi-thread tool for identifying DNA methylation motifs from Pacbio reads.

The methylation of DNA is an important mechanism for controlling biological processes. Recently, the Pacbio SMRT technology has provided a new way to identify base methylation in the genome. MotifMaker is a tool developed by Pacbio for discovering DNA methylation motifs from methylated DNA sequences. However, MotifMaker is single-threaded and computationally expensive when identifying methylation motifs from large genomes. Here, we present an efficient motif finding algorithm (MultiMotifMaker) that implements a multi-threaded version of MotifMaker. MultiMotifMaker speeds up the motif search about 8-9 times on a 32 core computer compared with MotifMaker. MultiMotifMaker makes it possible to identify methylation motifs from Pacbio reads for large genomes.


September 22, 2019

Validation of Genomic Structural Variants Through Long Sequencing Technologies.

Although numerous algorithms have been developed to identify large chromosomal rearrangements (i.e., genomic structural variants, SVs), there remains a dearth of approaches to evaluate their results. This is significant, as the accurate identification of SVs is still an outstanding problem: no single algorithm has been shown to achieve high sensitivity and specificity across different classes of SVs. The method introduced in this chapter, VaPoR, is specifically designed to evaluate the accuracy of SV predictions using third-generation long sequences. This method uses a recurrence approach and collects direct evidence from raw reads, thus avoiding computationally costly whole genome assembly. This chapter describes in detail how to apply this tool to different data types.


September 22, 2019

PBHoover and CigarRoller: a method for confident haploid variant calling on Pacific Biosciences data and its application to heterogeneous population analysis

Motivation: Single Molecule Real-Time (SMRT) sequencing has important and underutilized advantages that amplification-based platforms lack. These include the lack of systematic error (e.g. GC-bias) and complete de novo assembly (including large repetitive regions) without scaffolding. SMRT sequencing, however, suffers from a high random error rate and low sequencing depth (older chemistries). Here, we introduce PBHoover, software that uses a heuristic calling algorithm in order to make base calls with high certainty in low coverage regions. This software is also capable of mixed population detection with high sensitivity. PBHoover's CigarRoller attachment improves sequencing depth in low-coverage regions through CIGAR-string correction. Results: We tested both modules on 348 M. tuberculosis clinical isolates sequenced on C1 or C2 chemistries. On average, CigarRoller improved percentage of usable read count from 68.9% to 99.98% in C1 runs and from 50% to 99% in C2 runs. Using the greater depth provided by CigarRoller, PBHoover was able to make base and variant calls 99.95% concordant with Sanger calls (QV33). PBHoover also detected antibiotic-resistant subpopulations that went undetected by Sanger. Using C1 chemistry, subpopulations as small as 9% of the total colony can be detected by PBHoover. This provides the most sensitive amplification-free molecular method for heterogeneity analysis and is in line with phenotypic methods' sensitivity. This sensitivity significantly improves with the greater depth and lower error rate of the newer chemistries. Availability and Implementation: Executables are freely available under GNU GPL v3+ at http://www.gitlab.com/LPCDRP/pbhoover and http://www.gitlab.com/LPCDRP/CigarRoller. PBHoover is also available on bioconda: https://anaconda.org/bioconda/pbhoover.
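For readers unfamiliar with the two quantities mentioned above, the sketch below shows how a SAM-style CIGAR string can be parsed and how a concordance of 99.95% maps to roughly QV33 on the Phred scale. All function names are hypothetical; this does not reproduce CigarRoller's correction logic, which the abstract does not describe.

```python
import math
import re

# One CIGAR operation: a run length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, op) pairs, rejecting malformed input."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Rebuild the string to make sure no characters were silently skipped.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return ops


def reference_span(cigar):
    """Number of reference bases covered (operations M, D, N, = and X)."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MDN=X")


def query_length(cigar):
    """Number of read bases described (operations M, I, S, = and X)."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MIS=X")


def phred_qv(concordance):
    """Phred-scaled quality for a given fraction of concordant calls."""
    return -10 * math.log10(1 - concordance)


if __name__ == "__main__":
    print(parse_cigar("10M2I5D3S"))
    print(round(phred_qv(0.9995), 2))  # concordance reported in the abstract
```

The Phred conversion is the standard QV = -10 log10(error rate), so an error rate of 0.05% gives a value just above 33.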


September 22, 2019

Sequencing of Panax notoginseng genome reveals genes involved in disease resistance and ginsenoside biosynthesis

Background: Panax notoginseng is a traditional Chinese herb with high medicinal and economic value. There has been considerable research on the pharmacological activities of ginsenosides contained in Panax spp.; however, very little is known about the ginsenoside biosynthetic pathway. Results: We reported the first de novo genome of 2.36 Gb of sequences from P. notoginseng with 35,451 protein-encoding genes. Compared to other plants, we found notable gene family contraction of disease-resistance genes in P. notoginseng, but notable expansion for several ATP-binding cassette (ABC) transporter subfamilies, such as the Gpdr subfamily, indicating that ABCs might be an additional mechanism for the plant to cope with biotic stress. Combining eight transcriptomes of roots and aerial parts, we identified several key genes, their transcription factor binding sites and all their family members involved in the synthesis pathway of ginsenosides in P. notoginseng, including dammarenediol synthase, CYP716 and UGT71. Conclusions: The complete genome analysis of P. notoginseng, the first in genus Panax, will serve as an important reference sequence for improving breeding and cultivation of this important nutraceutical and medicinal but vulnerable plant species.


September 22, 2019

Comparative genomics of Escherichia coli sequence type 219 clones from the same patient: Evolution of the IncI1 blaCMY-carrying plasmid in vivo.

This study investigates the evolution of an Escherichia coli sequence type 219 clone in a patient with recurrent urinary tract infection, comparing isolate EC974 obtained prior to antibiotic treatment and isolate EC1515 recovered after exposure to several β-lactam antibiotics (ceftriaxone, cefixime, and imipenem). EC974 had a smooth colony morphology, while EC1515 had a rough colony morphology on sheep blood agar. RAPD-PCR analysis suggested that both isolates belonged to the same clone. Antimicrobial susceptibility tests showed that EC1515 was more resistant to piperacillin/tazobactam, cefepime, cefpirome, and ertapenem than EC974. Comparative genomic analysis was used to investigate the genetic changes of EC974 and EC1515 within the host, and showed three plasmids with replicons IncI1, P0111, and IncFII in both isolates. P0111-type plasmids pEC974-2 and pEC1515-2 contained the antibiotic resistance genes aadA2, tetA, and drfA12. IncFII-type plasmids pEC974-3 and pEC1515-3 contained the antibiotic resistance genes blaTEM-1, aadA1, aadA22, sul3, and inuF. Interestingly, blaCMY-111 and blaCMY-4 were found in very similar IncI1 plasmids that also contained aadA22 and aac(3)-IId, from isolates EC974 (pEC974-1) and EC1515 (pEC1515-1), respectively. The results showed in vivo amino acid substitutions converting blaCMY-111 to blaCMY-4 (R221W and A238V substitutions). Conjugation experiments showed a high frequency of IncI1 and IncFII plasmid co-transference. Transconjugants and DH5α cells harboring blaCMY-4 or blaCMY-111 showed higher levels of resistance to ampicillin, amoxicillin, cefazolin, cefuroxime, cefotaxime, cefixime, and ceftazidime, but not piperacillin/tazobactam, cefepime, or ertapenem. All known genes (outer membrane proteins and extended-spectrum AmpC β-lactamases) involved in ETP resistance in E. coli were identical between EC974 and EC1515. This is the first study to identify the evolution of an IncI1 plasmid within the host, and to characterize blaCMY-111 in E. coli.


September 22, 2019

Plasmodium vivax-like genome sequences shed new insights into Plasmodium vivax biology and evolution.

Although Plasmodium vivax is responsible for the majority of malaria infections outside Africa, little is known about its evolution and pathway to humans. Its closest genetic relative, P. vivax-like, was discovered in African great apes and is hypothesized to have given rise to P. vivax in humans. To unravel the evolutionary history and adaptation of P. vivax to different host environments, we used long- and short-read sequencing technologies to generate 2 new P. vivax-like reference genomes and 9 additional P. vivax-like genotypes. Analyses show that the genomes of P. vivax and P. vivax-like are highly similar and colinear within the core regions. Phylogenetic analyses clearly show that P. vivax-like parasites form a genetically distinct clade from P. vivax. Concerning the relative divergence dating, we show that the evolution of P. vivax in humans did not occur at the same time as the other agents of human malaria, thus suggesting that the transfer of Plasmodium parasites to humans happened several times independently over the history of the Homo genus. We further identify several key genes that exhibit signatures of positive selection exclusively in the human P. vivax parasites. Two of these genes have been identified to also be under positive selection in the other main human malaria agent, P. falciparum, thus suggesting their key role in the evolution of the ability of these parasites to infect humans or their anthropophilic vectors. Finally, we demonstrate that some gene families important for red blood cell (RBC) invasion (a key step of the life cycle of these parasites) have undergone lineage-specific evolution in the human parasite (e.g., reticulocyte-binding proteins [RBPs]).


September 22, 2019

A reference genome of the Chinese hamster based on a hybrid assembly strategy.

Accurate and complete genome sequences are essential in biotechnology to facilitate genome-based cell engineering efforts. The current genome assemblies for Cricetulus griseus, the Chinese hamster, are fragmented and replete with gap sequences and misassemblies, consistent with most short-read-based assemblies. Here, we completely resequenced C. griseus using single molecule real time sequencing and merged this with Illumina-based assemblies. This generated a more contiguous and complete genome assembly than either technology alone, reducing the number of scaffolds by >28-fold, with 90% of the sequence in the 122 longest scaffolds. Most genes are now found in single scaffolds, including up- and downstream regulatory elements, enabling improved study of noncoding regions. With >95% of the gap sequence filled, important Chinese hamster ovary cell mutations have been detected in draft assembly gaps. This new assembly will be an invaluable resource for continued basic and pharmaceutical research. © 2018 The Authors. Biotechnology and Bioengineering Published by Wiley Periodicals, Inc.
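The contiguity statement above ("90% of the sequence in the 122 longest scaffolds") corresponds to what is commonly called an L90-style statistic. A minimal sketch of how such a count can be computed from a list of scaffold lengths follows; the function name and the toy lengths are hypothetical and are not taken from the assembly itself.

```python
def scaffolds_needed(lengths, fraction=0.9):
    """Count how many of the longest scaffolds are needed to cover `fraction` of the assembly."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    target = fraction * sum(lengths)
    covered = 0
    # Walk scaffolds from longest to shortest until the target is reached.
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        covered += length
        if covered >= target:
            return count
    return 0  # only reached for an empty list


if __name__ == "__main__":
    # Hypothetical toy assembly: five scaffolds totalling 100 bases.
    print(scaffolds_needed([50, 30, 15, 3, 2]))
```

Calling the same function with `fraction=0.5` gives the analogous L50 count.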


September 22, 2019

Emergence of an XDR and carbapenemase-producing hypervirulent Klebsiella pneumoniae strain in Taiwan.

Carbapenemase-producing Klebsiella pneumoniae causes high mortality owing to the limited therapeutic options available. Here, we investigated an emergent carbapenem-resistant K. pneumoniae strain with hypervirulence found among KPC-2-producing strains in Taiwan. KPC-producing K. pneumoniae strains were collected consecutively from clinical specimens at the Taipei Veterans General Hospital between January 2012 and December 2014. Capsular types and the presence of rmpA/rmpA2 were analysed, and PFGE and MLST performed using these strains. The strain positive for rmpA/rmpA2 was tested in an in vivo mouse lethality study to verify its virulence and subjected to WGS to delineate its genomic features. A total of 62 KPC-2-producing K. pneumoniae strains were identified; all of these belonged to ST11 and capsular genotype K47. One strain isolated from a fatal case with intra-abdominal abscess (TVGHCRE225) harboured rmpA and rmpA2 genes. This strain was resistant to tigecycline and colistin, in addition to carbapenems, and did not belong to the major cluster in PFGE. TVGHCRE225 exhibited high in vivo virulence in the mouse lethality experiment. WGS showed that TVGHCRE225 acquired a novel hybrid virulence plasmid harbouring a set of virulence genes (iroBCDN, iucABCD, rmpA and rmpA2, and iutA) compared with the classic ST11 KPC-2-producing strain. We identified an XDR ST11 KPC-2-producing K. pneumoniae strain carrying a hybrid virulent plasmid in Taiwan. Active surveillance focusing on carbapenem-resistant hypervirulent K. pneumoniae strains is necessary, as the threat to human health is imminent.


September 22, 2019

Human copy number variants are enriched in regions of low mappability.

Copy number variants (CNVs) are known to affect a large portion of the human genome and have been implicated in many diseases. Although whole-genome sequencing (WGS) can help identify CNVs, most analytical methods suffer from limited sensitivity and specificity, especially in regions of low mappability. To address this, we use PopSV, a CNV caller that relies on multiple samples to control for technical variation. We demonstrate that our calls are stable across different types of repeat-rich regions and validate the accuracy of our predictions using orthogonal approaches. Applying PopSV to 640 human genomes, we find that low-mappability regions are approximately 5 times more likely to harbor germline CNVs, in stark contrast to the nearly uniform distribution observed for somatic CNVs in 95 cancer genomes. In addition to known enrichments in segmental duplication and near centromeres and telomeres, we also report that CNVs are enriched in specific types of satellite and in some of the most recent families of transposable elements. Finally, using this comprehensive approach, we identify 3455 regions with recurrent CNVs that were missing from existing catalogs. In particular, we identify 347 genes with a novel exonic CNV in low-mappability regions, including 29 genes previously associated with disease.


September 22, 2019

A synthetic-diploid benchmark for accurate variant-calling evaluation.

Existing benchmark datasets for use in evaluating variant-calling accuracy are constructed from a consensus of known short-variant callers, and they are thus biased toward easy regions that are accessible by these algorithms. We derived a new benchmark dataset from the de novo PacBio assemblies of two fully homozygous human cell lines, which provides a relatively more accurate and less biased estimate of small-variant-calling error rates in a realistic context.

