July 7, 2019

XCAVATOR: accurate detection and genotyping of copy number variants from second and third generation whole-genome sequencing experiments.

We developed a novel software package, XCAVATOR, for the identification of genomic regions involved in copy number variants/alterations (CNVs/CNAs) from short- and long-read whole-genome sequencing experiments. Using simulated and real datasets, we showed that our tool, based on a read count approach, is capable of predicting the boundaries and the absolute number of DNA copies of CNVs/CNAs with high resolution. To demonstrate the power of our software, we applied it to the analysis of Illumina and Pacific Biosciences data and compared its performance to that of ten other state-of-the-art tools. All the analyses we performed demonstrate that XCAVATOR is capable of detecting germline and somatic CNVs/CNAs, outperforming all the other tools we compared. XCAVATOR is freely available at http://sourceforge.net/projects/xcavator/ .
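
As a rough illustration of what a read count approach means in general, the sketch below converts per-window read counts into integer copy numbers by normalizing against the median count. It is not XCAVATOR's algorithm: the function name, the window counts, and the assumption that the median window is diploid are all hypothetical, and real tools also correct for GC content, mappability, and segment boundaries.

```python
from statistics import median


def estimate_copy_number(window_counts, ploidy=2):
    """Estimate an integer copy number for each genomic window.

    Assumption (not from the abstract): the median window is at the
    baseline ploidy, so each count is scaled relative to the median.
    """
    baseline = median(window_counts)
    if baseline <= 0:
        raise ValueError("median read count must be positive")
    # Scale each window to the baseline and round to the nearest integer.
    return [round(ploidy * count / baseline) for count in window_counts]


# Hypothetical read counts for eight consecutive windows.
counts = [100, 98, 102, 151, 149, 100, 52, 99]
copy_numbers = estimate_copy_number(counts)
print(copy_numbers)
```

In this toy input, the two windows with about 1.5 times the median depth are called as three copies, and the window with about half the median depth is called as one copy.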


July 7, 2019

Building a locally diploid genome and transcriptome of the diatom Fragilariopsis cylindrus.

The genome of the cold-adapted diatom Fragilariopsis cylindrus is characterized by highly diverged haplotypes that intersperse its homozygous genome. Here, we describe how a combination of PacBio DNA and Illumina RNA sequencing can be used to resolve this complex genomic landscape locally into the highly diverged haplotypes, and how to map various environmentally controlled transcripts onto individual haplotypes. We assembled PacBio sequence data with the FALCON assembler and created a haplotype-resolved annotation of the assembly using annotations of a Sanger-sequenced F. cylindrus genome. RNA-seq datasets from six different growth conditions were used to resolve allele-specific gene expression in F. cylindrus. This approach enables the study of differential expression of alleles in a complex genomic landscape and provides a useful tool to study how diverged haplotypes in diploid organisms are used for adaptation and evolution to highly variable environments.


July 7, 2019

Echinochloa crus-galli genome analysis provides insight into its adaptation and invasiveness as a weed.

Barnyardgrass (Echinochloa crus-galli) is a pernicious weed in agricultural fields worldwide. The molecular mechanisms underlying its success in the absence of human intervention are presently unknown. Here we report a draft genome sequence of the hexaploid species E. crus-galli, i.e., a 1.27 Gb assembly representing 90.7% of the predicted genome size. An extremely large repertoire of genes encoding cytochrome P450 monooxygenases and glutathione S-transferases associated with detoxification are found. Two gene clusters involved in the biosynthesis of an allelochemical 2,4-dihydroxy-7-methoxy-1,4-benzoxazin-3-one (DIMBOA) and a phytoalexin momilactone A are found in the E. crus-galli genome, respectively. The allelochemical DIMBOA gene cluster is activated in response to co-cultivation with rice, while the phytoalexin momilactone A gene cluster specifically to infection by pathogenic Pyricularia oryzae. Our results provide a new understanding of the molecular mechanisms underlying the extreme adaptation of the weed.


July 7, 2019

Genome architecture and evolution of a unichromosomal asexual nematode.

Asexual reproduction in animals, though rare, is the main or exclusive mode of reproduction in some long-lived lineages. The longevity of asexual clades may be correlated with the maintenance of heterozygosity by mechanisms that rearrange genomes and reduce recombination. Asexual species thus provide an opportunity to gain insight into the relationship between molecular changes, genome architecture, and cellular processes. Here we report the genome sequence of the parthenogenetic nematode Diploscapter pachys with only one chromosome pair. We show that this unichromosomal architecture is shared by a long-lived clade of asexual nematodes closely related to the genetic model organism Caenorhabditis elegans. Analysis of the genome assembly reveals that the unitary chromosome arose through fusion of six ancestral chromosomes, with extensive rearrangement among neighboring regions. Typical nematode telomeres and telomeric protection-encoding genes are lacking. Most regions show significant heterozygosity; homozygosity is largely concentrated to one region and attributed to gene conversion. Cell-biological and molecular evidence is consistent with the absence of key features of meiosis I, including synapsis and recombination. We propose that D. pachys preserves heterozygosity and produces diploid embryos without fertilization through a truncated meiosis. As a prelude to functional studies, we demonstrate that D. pachys is amenable to experimental manipulation by RNA interference. Copyright © 2017 Elsevier Ltd. All rights reserved.


July 7, 2019

Methylation-dependent DNA discrimination in natural transformation of Campylobacter jejuni.

Campylobacter jejuni, a leading cause of bacterial gastroenteritis, is naturally competent. Like many competent organisms, C. jejuni restricts the DNA that can be used for transformation to minimize undesirable changes in the chromosome. Although C. jejuni can be transformed by C. jejuni-derived DNA, it is poorly transformed by the same DNA propagated in Escherichia coli or produced with PCR. Our work indicates that methylation plays an important role in marking DNA for transformation. We have identified a highly conserved DNA methyltransferase, which we term Campylobacter transformation system methyltransferase (ctsM), which methylates an overrepresented 6-bp sequence in the chromosome. DNA derived from a ctsM mutant transforms C. jejuni significantly less well than DNA derived from ctsM(+) (parental) cells. The ctsM mutation itself does not affect transformation efficiency when parental DNA is used, suggesting that CtsM is important for marking transforming DNA, but not for transformation itself. The mutant has no growth defect, arguing against ongoing restriction of its own DNA. We further show that E. coli plasmid and PCR-derived DNA can efficiently transform C. jejuni when only a subset of the CtsM sites are methylated in vitro. A single methylation event 1 kb upstream of the DNA involved in homologous recombination is sufficient to transform C. jejuni, whereas otherwise identical unmethylated DNA is not. Methylation influences DNA uptake, with a slight effect also seen on DNA binding. This mechanism of DNA discrimination in C. jejuni is distinct from the DNA discrimination described in other competent bacteria.


July 7, 2019

Shared features of cryptic plasmids from environmental and pathogenic Francisella species.

The Francisella genus includes several recognized species, additional potential species, and other representatives that inhabit a range of incredibly diverse ecological niches, but are not closely related to the named species. Francisella species have been obtained from a wide variety of clinical and environmental sources; documented species include highly virulent human and animal pathogens, fish pathogens, opportunistic human pathogens, tick endosymbionts, and free-living isolates inhabiting brackish water. While more than 120 Francisella genomes have been sequenced to date, only a few contain plasmids, and most of these appear to be cryptic, with unknown benefit to the host cell. We have identified several putative cryptic plasmids in the sequenced genomes of three Francisella novicida and F. novicida-like strains (TX07-6608, AZ06-7470, DPG_3A-IS) and two new Francisella species (F. frigiditurris CA97-1460 and F. opportunistica MA06-7296). These plasmids were compared to each other and to previously identified plasmids from other Francisella species. Some of the plasmids encoded functions potentially involved in replication, conjugal transfer and partitioning, environmental survival (transcriptional regulation, signaling, metabolism), and hypothetical proteins with no assignable functions. Genomic and phylogenetic comparisons of these new plasmids to the other known Francisella plasmids revealed some similarities that add to our understanding of the evolutionary relationships among the diverse Francisella species.


July 7, 2019

The sea cucumber genome provides insights into morphological evolution and visceral regeneration.

Apart from sharing common ancestry with chordates, sea cucumbers exhibit a unique morphology and exceptional regenerative capacity. Here we present the complete genome sequence of an economically important sea cucumber, A. japonicus, generated using Illumina and PacBio platforms, to achieve an assembly of approximately 805 Mb (contig N50 of 190 Kb and scaffold N50 of 486 Kb), with 30,350 protein-coding genes and high continuity. We used this resource to explore key genetic mechanisms behind the unique biological characters of sea cucumbers. Phylogenetic and comparative genomic analyses revealed the presence of marker genes associated with notochord and gill slits, suggesting that these chordate features were present in ancestral echinoderms. The unique shape and weak mineralization of the sea cucumber adult body were also preliminarily explained by the contraction of biomineralization genes. Genome, transcriptome, and proteome analyses of organ regrowth after induced evisceration provided insight into the molecular underpinnings of visceral regeneration, including a specific tandem-duplicated prostatic secretory protein of 94 amino acids (PSP94)-like gene family and a significantly expanded fibrinogen-related protein (FREP) gene family. This high-quality genome resource will provide a useful framework for future research into biological processes and evolution in deuterostomes, including remarkable regenerative abilities that could have medical applications. Moreover, the multiomics data will be of prime value for commercial sea cucumber breeding programs.
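
For readers unfamiliar with the N50 statistic quoted for contigs and scaffolds, the following minimal sketch shows the standard definition: the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the assembly size. The function name and the example lengths are illustrative only and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # The first sequence that brings the sum to half the total is the N50.
        if running * 2 >= total:
            return length
    raise ValueError("no lengths given")


# Hypothetical contig lengths (arbitrary units); the total is 290.
example_lengths = [80, 70, 50, 40, 30, 20]
print(n50(example_lengths))  # 80 + 70 = 150 is the first sum >= 145, so N50 = 70
```

A larger N50 indicates that more of the assembly sits in long sequences, which is why scaffolding typically raises the value relative to the contig N50.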


July 7, 2019

Dense and accurate whole-chromosome haplotyping of individual genomes.

The diploid nature of the human genome is neglected in many analyses done today, where a genome is perceived as a set of unphased variants with respect to a reference genome. This lack of haplotype-level analyses can be explained by a lack of methods that can produce dense and accurate chromosome-length haplotypes at reasonable costs. Here we introduce an integrative phasing strategy that combines global, but sparse haplotypes obtained from strand-specific single-cell sequencing (Strand-seq) with dense, yet local, haplotype information available through long-read or linked-read sequencing. We provide comprehensive guidance on the required sequencing depths and reliably assign more than 95% of alleles (NA12878) to their parental haplotypes using as few as 10 Strand-seq libraries in combination with 10-fold coverage PacBio data or, alternatively, 10X Genomics linked-read sequencing data. We conclude that the combination of Strand-seq with different technologies represents an attractive solution to chart the genetic variation of diploid genomes.


July 7, 2019

Hybrid de novo genome assembly and centromere characterization of the gray mouse lemur (Microcebus murinus).

The de novo assembly of repeat-rich mammalian genomes using only high-throughput short read sequencing data typically results in highly fragmented genome assemblies that limit downstream applications. Here, we present an iterative approach to hybrid de novo genome assembly that incorporates datasets stemming from multiple genomic technologies and methods. We used this approach to improve the gray mouse lemur (Microcebus murinus) genome from early draft status to a near chromosome-scale assembly. We used a combination of advanced genomic technologies to iteratively resolve conflicts and super-scaffold the M. murinus genome. We improved the M. murinus genome assembly to a scaffold N50 of 93.32 Mb. Whole genome alignments between our primary super-scaffolds and 23 human chromosomes revealed patterns that are congruent with historical comparative cytogenetic data, thus demonstrating the accuracy of our de novo scaffolding approach and allowing assignment of scaffolds to M. murinus chromosomes. Moreover, we utilized our independent datasets to discover and characterize sequences associated with centromeres across the mouse lemur genome. Quality assessment of the final assembly found 96% of mouse lemur canonical transcripts nearly complete, comparable to other published high-quality reference genome assemblies. We describe a new assembly of the gray mouse lemur (Microcebus murinus) genome with chromosome-scale scaffolds produced using a hybrid bioinformatic and sequencing approach. The approach is cost effective and produces superior results based on metrics of contiguity and completeness. Our results show that emerging genomic technologies can be used in combination to characterize centromeres of non-model species and to produce accurate de novo chromosome-scale genome assemblies of complex mammalian genomes.


July 7, 2019

Comparative genome analysis of programmed DNA elimination in nematodes.

Programmed DNA elimination is a developmentally regulated process leading to the reproducible loss of specific genomic sequences. DNA elimination occurs in unicellular ciliates and a variety of metazoans, including invertebrates and vertebrates. In metazoa, DNA elimination typically occurs in somatic cells during early development, leaving the germline genome intact. Reference genomes for metazoa that undergo DNA elimination are not available. Here, we generated germline and somatic reference genome sequences of the DNA eliminating pig parasitic nematode Ascaris suum and the horse parasite Parascaris univalens. In addition, we carried out in-depth analyses of DNA elimination in the parasitic nematode of humans, Ascaris lumbricoides, and the parasitic nematode of dogs, Toxocara canis. Our analysis of nematode DNA elimination reveals that in all species, repetitive sequences (that differ among the genera) and germline-expressed genes (approximately 1000-2000 or 5%-10% of the genes) are eliminated. Thirty-five percent of these eliminated genes are conserved among these nematodes, defining a core set of eliminated genes that are preferentially expressed during spermatogenesis. Our analysis supports the view that DNA elimination in nematodes silences germline-expressed genes. Over half of the chromosome break sites are conserved between Ascaris and Parascaris, whereas only 10% are conserved in the more divergent T. canis. Analysis of the chromosomal breakage regions suggests a sequence-independent mechanism for DNA breakage followed by telomere healing, with the formation of more accessible chromatin in the break regions prior to DNA elimination. Our genome assemblies and annotations also provide comprehensive resources for analysis of DNA elimination, parasitology research, and comparative nematode genome and epigenome studies. © 2017 Wang et al.; Published by Cold Spring Harbor Laboratory Press.


July 7, 2019

Detection of complex structural variation from paired-end sequencing data

Detecting structural variants (SVs) from sequencing data is a key problem in genome analysis, but the full diversity of SVs is not captured by most methods. We introduce the Automated Reconstruction of Complex Structural Variants (ARC-SV) method, which detects a broad class of structural variants from paired-end whole genome sequencing (WGS) data. Analysis of samples from NA12878 and HuRef suggests that complex SVs are often misclassified by traditional methods. We validated our results both experimentally and by comparison to whole genome assembly and PacBio data; ARC-SV compares favorably to existing algorithms in general and gives state-of-the-art results on complex SV detection. By expanding the range of detectable SVs compared to commonly-used algorithms, ARC-SV allows additional information to be extracted from existing WGS data.


July 7, 2019

Filling reference gaps via assembling DNA barcodes using high-throughput sequencing-moving toward barcoding the world.

Over the past decade, biodiversity researchers have dedicated tremendous efforts to constructing DNA reference barcodes for rapid species registration and identification. Although analytical cost for standard DNA barcoding has been significantly reduced since early 2000, further dramatic reduction in barcoding costs is unlikely because Sanger sequencing is approaching its limits in throughput and chemistry cost. Constraints in barcoding cost not only led to unbalanced barcoding efforts around the globe, but also prevented high-throughput sequencing (HTS)-based taxonomic identification from applying binomial species names, which provide crucial linkages to biological knowledge. We developed an Illumina-based pipeline, HIFI-Barcode, to produce full-length Cytochrome c oxidase subunit I (COI) barcodes from pooled polymerase chain reaction amplicons generated by individual specimens. The new pipeline generated accurate barcode sequences that were comparable to Sanger standards, even for different haplotypes of the same species that were only a few nucleotides different from each other. Additionally, the new pipeline was much more sensitive in recovering amplicons at low quantity. The HIFI-Barcode pipeline successfully recovered barcodes from more than 78% of the polymerase chain reactions that did not show clear bands on the electrophoresis gel. Moreover, sequencing results based on the single-molecule sequencing platform PacBio confirmed the accuracy of the HIFI-Barcode results. Altogether, the new pipeline can provide an improved solution to produce full-length reference barcodes at about one-tenth of the current cost, enabling construction of comprehensive barcode libraries for local fauna, leading to a feasible direction for DNA barcoding global biomes. © The Authors 2017. Published by Oxford University Press.


July 7, 2019

Genomics of parallel adaptation at two timescales in Drosophila.

Two interesting unanswered questions are the extent to which both the broad patterns and genetic details of adaptive divergence are repeatable across species, and the timescales over which parallel adaptation may be observed. Drosophila melanogaster is a key model system for population and evolutionary genomics. Findings from genetics and genomics suggest that recent adaptation to latitudinal environmental variation (on the timescale of hundreds or thousands of years) associated with Out-of-Africa colonization plays an important role in maintaining biological variation in the species. Additionally, studies of interspecific differences between D. melanogaster and its sister species D. simulans have revealed that a substantial proportion of proteins and amino acid residues exhibit adaptive divergence on a timescale of roughly a few million years. Here we use population genomic approaches to attack the problem of parallelism between D. melanogaster and a highly diverged congener, D. hydei, on two timescales. D. hydei, a member of the repleta group of Drosophila, is similar to D. melanogaster in that it too appears to be a recently cosmopolitan species and recent colonizer of high latitude environments. We observed parallelism both for genes exhibiting latitudinal allele frequency differentiation within species and for genes exhibiting recurrent adaptive protein divergence between species. Greater parallelism was observed for long-term adaptive protein evolution and this parallelism includes not only the specific genes/proteins that exhibit adaptive evolution, but extends even to the magnitudes of the selective effects on interspecific protein differences. Thus, despite the roughly 50 million years of time separating D. melanogaster and D. hydei, and despite their considerably divergent biology, they exhibit substantial parallelism, suggesting the existence of a fundamental predictability of adaptive evolution in the genus.


July 7, 2019

Remarkable diversity of Escherichia coli carrying mcr-1 from hospital sewage with the identification of two new mcr-1 variants.

The plasmid-borne colistin-resistant gene mcr-1 has rapidly become a worldwide public health concern. This study aims to determine the host bacterial strains, plasmids, and genetic contexts of mcr-1 in hospital sewage. A 1-ml hospital sewage sample was cultured. Colistin-resistant bacterial colonies were selected on agar plates and were subjected to whole genome sequencing and subsequent analysis. The transfer of mcr-1 between bacterial strains was tested using conjugation. New variants of mcr-1 were cloned to test the impact of variations on the function of mcr-1. Plasmids carrying mcr-1 were retrieved from GenBank for comparison based on concatenated backbone genes. In the sewage sample, we observed that mcr-1 was located in various genetic contexts on the chromosome, or plasmids of four different replicon types (IncHI2, IncI2, IncP, and IncX4), in Klebsiella pneumoniae, Kluyvera spp. and seven Escherichia coli strains of six different sequence types (ST10, ST34, ST48, ST1196, ST7086, and ST7087). We also identified two new variants of mcr-1, mcr-1.4 and mcr-1.7, both of which encode an amino acid variation from mcr-1. mcr-1-carrying IncX4 plasmids, which have a global distribution across the Enterobacteriaceae, are the result of global dissemination of a single common plasmid, while IncI2 mcr-1 plasmids appear to acquire mcr-1 in multiple events. In conclusion, the unprecedented remarkable diversity of species, strains, plasmids, and genetic contexts carrying mcr-1 present in a single sewage sample from a single healthcare site highlights the continued evolution and dynamic transmission of mcr-1 in healthcare-associated environments.


July 7, 2019

Scallop genome reveals molecular adaptations to semi-sessile life and neurotoxins.

Bivalve molluscs are descendants of an early-Cambrian lineage superbly adapted to benthic filter feeding. Adaptations in form and behavior are well recognized, but the underlying molecular mechanisms are largely unknown. Here, we investigate the genome, various transcriptomes, and proteomes of the scallop Chlamys farreri, a semi-sessile bivalve with well-developed adductor muscle, sophisticated eyes, and remarkable neurotoxin resistance. The scallop’s large striated muscle is energy-dynamic but not fully differentiated from smooth muscle. Its eyes are supported by highly diverse, intronless opsins expanded by retroposition for broadened spectral sensitivity. Rapid byssal secretion is enabled by a specialized foot and multiple proteins including expanded tyrosinases. The scallop uses hepatopancreas to accumulate neurotoxins and kidney to transform to high-toxicity forms through expanded sulfotransferases, probably as deterrence against predation, while it achieves neurotoxin resistance through point mutations in sodium channels. These findings suggest that expansion and mutation of those genes may have profound effects on scallop’s phenotype and adaptation.

