July 7, 2019

The Mobile Element Locator Tool (MELT): population-scale mobile element discovery and biology.

Mobile element insertions (MEIs) represent ~25% of all structural variants in human genomes. Moreover, when they disrupt genes, MEIs can influence human traits and diseases. Therefore, MEIs should be fully discovered along with other forms of genetic variation in whole genome sequencing (WGS) projects involving population genetics, human diseases, and clinical genomics. Here, we describe the Mobile Element Locator Tool (MELT), which was developed as part of the 1000 Genomes Project to perform MEI discovery on a population scale. Using both Illumina WGS data and simulations, we demonstrate that MELT outperforms existing MEI discovery tools in terms of speed, scalability, specificity, and sensitivity, while also detecting a broader spectrum of MEI-associated features. Several run modes were developed to perform MEI discovery on local and cloud systems. In addition to using MELT to discover MEIs in modern humans as part of the 1000 Genomes Project, we also used it to discover MEIs in chimpanzees and ancient (Neanderthal and Denisovan) hominids. We detected diverse patterns of MEI stratification across these populations that likely were caused by (1) diverse rates of MEI production from source elements, (2) diverse patterns of MEI inheritance, and (3) the introgression of ancient MEIs into modern human genomes. Overall, our study provides the most comprehensive map of MEIs to date spanning chimpanzees, ancient hominids, and modern humans and reveals new aspects of MEI biology in these lineages. We also demonstrate that MELT is a robust platform for MEI discovery and analysis in a variety of experimental settings. © 2017 Gardner et al.; Published by Cold Spring Harbor Laboratory Press.


July 7, 2019

Multiple hybrid de novo genome assembly of finger millet, an orphan allotetraploid crop.

Finger millet (Eleusine coracana (L.) Gaertn) is an important crop for food security because of its tolerance to drought, which is expected to be exacerbated by global climate change. Nevertheless, it is often classified as an orphan/underutilized crop because of the paucity of scientific attention. Among several small millets, finger millet is considered an excellent source of essential nutrient elements, such as iron and zinc; hence, it has potential as an alternate coarse cereal. However, high-quality genome sequence data of finger millet are currently not available. One of the major problems encountered in the genome assembly of this species was its polyploidy, which makes assembly harder than for a diploid genome. To overcome this problem, we sequenced its genome using diverse technologies with sufficient coverage and assembled it via a novel multiple hybrid assembly workflow that combines next-generation with single-molecule sequencing, followed by whole-genome optical mapping using the Bionano Irys® system. The total number of scaffolds was 1,897 with an N50 length >2.6 Mb and detection of 96% of the universal single-copy orthologs. The majority of the homeologs were assembled separately. This indicates that the proposed workflow is applicable to the assembly of other allotetraploid genomes. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
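
For readers unfamiliar with the N50 statistic quoted in the abstract above, the following is a minimal sketch of how it is commonly computed from a list of scaffold lengths: N50 is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly. The function name `n50` and the example lengths are illustrative assumptions, not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Hypothetical helper for illustration only; not the authors' code.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold downwards until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Made-up scaffold lengths (in bp), purely for demonstration.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_lengths))
```

An N50 above 2.6 Mb therefore means that at least half of the assembled sequence lies in scaffolds longer than 2.6 Mb.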


July 7, 2019

SureMap: Versatile, error tolerant, and high sensitive read mapper

SureMap is a versatile, error-tolerant and highly sensitive read mapper that can map “difficult” reads, those requiring many edit operations to be mapped to the reference genome, with acceptable time complexity. Mapping real datasets reveals that many variants unidentifiable by other mappers can be called using SureMap. Moreover, SureMap has a very good running time and accuracy in aligning very long and noisy reads, such as PacBio and Nanopore reads, against a reference genome.


July 7, 2019

Echinobase: an expanding resource for echinoderm genomic information

Echinobase, a web-accessible information system of diverse genomics and biological data for the echinoderm clade, grew out of SpBase, the first echinoderm genome project for sea urchin, Strongylocentrotus purpuratus. Sea urchins and their relatives are utilitarian research models in fields ranging from marine biology to developmental biology and gene regulatory systems. Echinobase is a user-friendly web interface that links an array of biological data that would otherwise have been tedious and frustrating for researchers to extract and organize. The system hosts a powerful gene search engine, genomics browser and other bioinformatics tools to investigate genomics and high-throughput data. The Echinobase information system now serves genomic information for eight echinoderm species: S. purpuratus, Strongylocentrotus franciscanus, Allocentrotus fragilis, Lytechinus variegatus, Patiria miniata, Parastichopus parvimensis, Ophiothrix spiculata and Eucidaris tribuloides. Herein lies a description of the web information system, genomics data types and content hosted by Echinobase.org. The goal of Echinobase is to connect genomic information to various experimental data and accelerate research in the fields of molecular biology, developmental processes, gene regulatory networks and, more recently, engineering biological systems.


July 7, 2019

Structural variation offers new home for disease associations and gene discovery

Following completion of the Human Genome Project, most studies of human genetic variation have centered on single nucleotide polymorphisms (SNPs). SNPs are numerous in individual genomes and serve as useful genetic markers in association studies across a population. These markers have been leveraged to identify genetic loci for disease risk and draw associations with numerous traits of interest. Despite their usefulness, SNPs do not tell the whole story. For example, most SNPs are associated with only a small increased risk of disease, and they usually cannot identify on their own which genes are causal. This has resulted in what many researchers have referred to as missing or hidden heritability.


July 7, 2019

Lightning-fast genome variant detection with GROM.

Current human whole genome sequencing projects produce massive amounts of data, often creating significant computational challenges. Different approaches have been developed for each type of genome variant and method of its detection, necessitating users to run multiple algorithms to find variants. We present GROM (Genome Rearrangement OmniMapper), a novel comprehensive variant detection algorithm accepting aligned read files as input and finding SNVs, indels, structural variants (SVs), and copy number variants (CNVs). We show that GROM outperforms state-of-the-art methods on seven validated benchmarks using two whole genome sequencing (WGS) datasets. Additionally, GROM boasts lightning fast run times, analyzing a 50x WGS human dataset (NA12878) on commonly available computer hardware in 11 minutes, more than an order of magnitude (up to 72 times) faster than tools detecting a similar range of variants. Addressing the needs of big data analysis, GROM combines in one algorithm SNV, indel, SV, and CNV detection providing superior speed, sensitivity, and precision. GROM is also able to detect CNVs, SNVs and indels in non-paired read WGS libraries, as well as SNVs and indels in whole exome or RNA sequencing datasets.


July 7, 2019

Exception to the rule: Genomic characterization of naturally occurring unusual Vibrio cholerae strains with a single chromosome.

The genetic make-up of most bacteria is encoded in a single chromosome while about 10% have more than one chromosome. Among these, Vibrio cholerae, with two chromosomes, has served as a model system to study various aspects of chromosome maintenance, mainly replication, and faithful partitioning of multipartite genomes. Here, we describe the genomic characterization of strains that are an exception to the two-chromosome rule: naturally occurring single-chromosome V. cholerae. Whole genome sequence analyses of NSCV1 and NSCV2 (natural single-chromosome vibrio) revealed that the Chr1 and Chr2 fusion junctions contain prophages, IS elements, and direct repeats, in addition to large-scale chromosomal rearrangements such as inversions, insertions, and long tandem repeats elsewhere in the chromosome compared to prototypical two-chromosome V. cholerae genomes. Many of the known cholera virulence factors are absent. The two origins of replication and associated genes are generally intact with synonymous mutations in some genes, as are recA and mismatch repair (MMR) genes dam, mutH, and mutL; MutS function is probably impaired in NSCV2. These strains are ideal tools for studying mechanistic aspects of maintenance of chromosomes with multiple origins and other rearrangements and the biological, functional, and evolutionary significance of multipartite genome architecture in general.


July 7, 2019

Complete genome analysis of Lactobacillus fermentum SK152 from kimchi reveals genes associated with its antimicrobial activity.

Research findings on probiotics highlight their importance in repressing harmful bacteria, leading to more extensive research on their potential applications. We analysed the genome of Lactobacillus fermentum SK152, which was isolated from the Korean traditional fermented vegetable dish kimchi, to determine the genetic makeup and genetic factors responsible for the antimicrobial activity of L. fermentum SK152 and performed a comparative genome analysis with other L. fermentum strains. The genome of L. fermentum SK152 was found to comprise a complete circular chromosome of 2,092,273 bp, with an estimated GC content of 51.9% and 2184 open reading frames. It consisted of 2038 protein-coding genes and 73 RNA-coding genes. Moreover, a gene encoding a putative endolysin was found. A comparative genome analysis with other L. fermentum strains showed that SK152 is closely related to L. fermentum 3872 and F-6. An evolutionary analysis identified five positively selected genes that encode proteins associated with transport, survival and stress resistance. These positively selected genes may be essential for L. fermentum to colonise and survive in the stringent environment of the human gut and exert its beneficial effects. Our findings highlight the potential benefits of SK152. © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
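
As a brief illustration of the GC content figure quoted in the abstract above, the sketch below computes the percentage of G and C bases among the unambiguous bases of a sequence. The function name `gc_content` and the toy sequences are assumptions made for this example; the abstract does not say which tool the authors used to estimate the value.

```python
def gc_content(sequence):
    """Return GC content as a percentage of unambiguous (A, C, G, T) bases.

    Hypothetical helper for illustration only; not the authors' method.
    """
    seq = sequence.upper()
    # Ignore ambiguous symbols such as N when counting the denominator.
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Toy sequence, not real L. fermentum data.
print(gc_content("GGCCAATT"))
```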


July 7, 2019

Identification of low allele frequency mosaic mutations in Alzheimer disease

Germline mutations of the APP, PSEN1, and PSEN2 genes cause autosomal dominant Alzheimer disease (AD). Somatic variants of the same genes may underlie pathogenesis in sporadic AD, which is the most prevalent form of the disease. Importantly, such somatic variants may be present at very low allelic frequency, confined to the brain, and are thus very difficult or impossible to detect in blood-derived DNA. Ever-refined methodologies to identify mutations present in a fraction of the DNA of the original tissue are rapidly transforming our understanding of DNA mutations and their role in complex pathologies such as tumors. These methods stand poised to test to what extent somatic variants may play a role in AD and other neurodegenerative diseases.


July 7, 2019

Harnessing whole genome sequencing in medical mycology.

Comparative genome sequencing studies of human fungal pathogens enable identification of genes and variants associated with virulence and drug resistance. This review describes current approaches, resources, and advances in applying whole genome sequencing to study clinically important fungal pathogens. Genomes for some important fungal pathogens were only recently assembled, revealing gene family expansions in many species and extreme gene loss in one obligate species. The scale and scope of species sequenced is rapidly expanding, leveraging technological advances to assemble and annotate genomes with higher precision. By using iteratively improved reference assemblies or those generated de novo for new species, recent studies have compared the sequence of isolates representing populations or clinical cohorts. Whole genome approaches provide the resolution necessary for comparison of closely related isolates, for example, in the analysis of outbreaks or sampled across time within a single host. Genomic analysis of fungal pathogens has enabled both basic research and diagnostic studies. The increased scale of sequencing can be applied across populations, and new metagenomic methods allow direct analysis of complex samples.


July 7, 2019

Genome-wide engineering of an infectious clone of herpes simplex virus type 1 using synthetic genomics assembly methods.

Here, we present a transformational approach to genome engineering of herpes simplex virus type 1 (HSV-1), which has a large DNA genome, using synthetic genomics tools. We believe this method will enable more rapid and complex modifications of HSV-1 and other large DNA viruses than previous technologies, facilitating many useful applications. Yeast transformation-associated recombination was used to clone 11 fragments comprising the HSV-1 strain KOS 152 kb genome. Using overlapping sequences between the adjacent pieces, we assembled the fragments into a complete virus genome in yeast, transferred it into an Escherichia coli host, and reconstituted infectious virus following transfection into mammalian cells. The virus derived from this yeast-assembled genome, KOS(YA), replicated with kinetics similar to wild-type virus. We demonstrated the utility of this modular assembly technology by making numerous modifications to a single gene, making changes to two genes at the same time and, finally, generating individual and combinatorial deletions to a set of five conserved genes that encode virion structural proteins. While the ability to perform genome-wide editing through assembly methods in large DNA virus genomes raises dual-use concerns, we believe the incremental risks are outweighed by potential benefits. These include enhanced functional studies, generation of oncolytic virus vectors, development of delivery platforms of genes for vaccines or therapy, as well as more rapid development of countermeasures against potential biothreats.


July 7, 2019

Shared features of cryptic plasmids from environmental and pathogenic Francisella species.

The Francisella genus includes several recognized species, additional potential species, and other representatives that inhabit a range of incredibly diverse ecological niches, but are not closely related to the named species. Francisella species have been obtained from a wide variety of clinical and environmental sources; documented species include highly virulent human and animal pathogens, fish pathogens, opportunistic human pathogens, tick endosymbionts, and free-living isolates inhabiting brackish water. While more than 120 Francisella genomes have been sequenced to date, only a few contain plasmids, and most of these appear to be cryptic, with unknown benefit to the host cell. We have identified several putative cryptic plasmids in the sequenced genomes of three Francisella novicida and F. novicida-like strains (TX07-6608, AZ06-7470, DPG_3A-IS) and two new Francisella species (F. frigiditurris CA97-1460 and F. opportunistica MA06-7296). These plasmids were compared to each other and to previously identified plasmids from other Francisella species. Some of the plasmids encoded functions potentially involved in replication, conjugal transfer and partitioning, environmental survival (transcriptional regulation, signaling, metabolism), and hypothetical proteins with no assignable functions. Genomic and phylogenetic comparisons of these new plasmids to the other known Francisella plasmids revealed some similarities that add to our understanding of the evolutionary relationships among the diverse Francisella species.


July 7, 2019

Avoidance of APOBEC3B-induced mutation by error-free lesion bypass.

APOBEC cytidine deaminases mutate cancer genomes by converting cytidines into uridines within ssDNA during replication. Although uracil DNA glycosylases limit APOBEC-induced mutation, it is unknown if subsequent base excision repair (BER) steps function on replication-associated ssDNA. Hence, we measured APOBEC3B-induced CAN1 mutation frequencies in yeast deficient in BER endonucleases or DNA damage tolerance proteins. Strains lacking Apn1, Apn2, Ntg1, Ntg2 or Rev3 displayed wild-type frequencies of APOBEC3B-induced canavanine resistance (CanR). However, strains without error-free lesion bypass proteins Ubc13, Mms2 and Mph1 displayed respective 4.9-, 2.8- and 7.8-fold higher frequency of APOBEC3B-induced CanR. These results indicate that mutations resulting from APOBEC activity are avoided by deoxyuridine conversion to abasic sites ahead of nascent lagging strand DNA synthesis and subsequent bypass by error-free template switching. We found this mechanism also functions during telomere re-synthesis, but with a diminished requirement for Ubc13. Interestingly, reduction of G to C substitutions in Ubc13-deficient strains uncovered a previously unknown role of Ubc13 in controlling the activity of the translesion synthesis polymerase, Rev1. Our results highlight a novel mechanism for error-free bypass of deoxyuridines generated within ssDNA and suggest that the APOBEC mutation signature observed in cancer genomes may under-represent the genomic damage these enzymes induce. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.


July 7, 2019

Length-independent DNA packing into nanopore zero-mode waveguides for low-input DNA sequencing.

Compared with conventional methods, single-molecule real-time (SMRT) DNA sequencing exhibits longer read lengths, less GC bias, and the ability to read DNA base modifications. However, reading DNA sequence from sub-nanogram quantities is impractical owing to inefficient delivery of DNA molecules into the confines of zero-mode waveguides, the zeptolitre optical cavities in which DNA sequencing proceeds. Here, we show that the efficiency of voltage-induced DNA loading into waveguides equipped with nanopores at their floors is five orders of magnitude greater than existing methods. In addition, we find that DNA loading is nearly length-independent, unlike diffusive loading, which is biased towards shorter fragments. We demonstrate here loading and proof-of-principle four-colour sequence readout of a polymerase-bound 20,000-base-pair-long DNA template within seconds from a sub-nanogram input quantity, a step towards low-input DNA sequencing and mammalian epigenomic mapping of native DNA samples.


July 7, 2019

Genome-wide discovery of genes required for capsule production by uropathogenic Escherichia coli.

Uropathogenic Escherichia coli (UPEC) is a major cause of urinary tract and bloodstream infections and possesses an array of virulence factors for colonization, survival, and persistence. One such factor is the polysaccharide K capsule. Among the different K capsule types, the K1 serotype is strongly associated with UPEC infection. In this study, we completely sequenced the K1 UPEC urosepsis strain PA45B and employed a novel combination of a lytic K1 capsule-specific phage, saturated Tn5 transposon mutagenesis, and high-throughput transposon-directed insertion site sequencing (TraDIS) to identify the complement of genes required for capsule production. Our analysis identified known genes involved in capsule biosynthesis, as well as two additional regulatory genes (mprA and lrhA) that we characterized at the molecular level. Mutation of mprA resulted in protection against K1 phage-mediated killing, a phenotype restored by complementation. We also identified a significantly increased unidirectional Tn5 insertion frequency upstream of the lrhA gene and showed that strong expression of LrhA induced by a constitutive Pcl promoter led to loss of capsule production. Further analysis revealed loss of MprA or overexpression of LrhA affected the transcription of capsule biosynthesis genes in PA45B and increased sensitivity to killing in whole blood. Similar phenotypes were also observed in UPEC strains UTI89 (K1) and CFT073 (K2), demonstrating that the effects were neither strain nor capsule type specific. Overall, this study defined the genome of a UPEC urosepsis isolate and identified and characterized two new regulatory factors that affect UPEC capsule production. IMPORTANCE: Urinary tract infections (UTIs) are among the most common bacterial infections in humans and are primarily caused by uropathogenic Escherichia coli (UPEC). Many UPEC strains express a polysaccharide K capsule that provides protection against host innate immune factors and contributes to survival and persistence during infection. The K1 serotype is one example of a polysaccharide capsule type and is strongly associated with UPEC strains that cause UTIs, bloodstream infections, and meningitis. The number of UTIs caused by antibiotic-resistant UPEC is steadily increasing, highlighting the need to better understand factors (e.g., the capsule) that contribute to UPEC pathogenesis. This study describes the original and novel application of lytic capsule-specific phage killing, saturated Tn5 transposon mutagenesis, and high-throughput transposon-directed insertion site sequencing to define the entire complement of genes required for capsule production in UPEC. Our comprehensive approach uncovered new genes involved in the regulation of this key virulence determinant. Copyright © 2017 Goh et al.

