July 7, 2019  |  

The cacao Criollo genome v2.0: an improved version of the genome for genetic and functional genomic studies.

Theobroma cacao L., native to the Amazonian basin of South America, is an economically important fruit tree crop for tropical countries as a source of chocolate. The first draft genome of the species, from a Criollo cultivar, was published in 2011. Although a useful resource, some improvements are possible, including identifying misassemblies, reducing the number of scaffolds and gaps, and anchoring un-anchored sequences to the 10 chromosomes. We used an NGS-based approach to significantly improve the assembly of the Belizean Criollo B97-61/B2 genome. We combined four Illumina large insert size mate paired libraries with 52x of Pacific Biosciences long reads to correct misassembled regions and reduce the number of scaffolds. We then used genotyping by sequencing (GBS) methods to increase the proportion of the assembly anchored to chromosomes. The scaffold number decreased from 4,792 in assembly V1 to 554 in V2, while the scaffold N50 size increased from 0.47 Mb in V1 to 6.5 Mb in V2. A total of 96.7% of the assembly was anchored to the 10 chromosomes compared to 66.8% in the previous version. Unknown sites (Ns) were reduced from 10.8% to 5.7%. In addition, we updated the functional annotations and performed a new RefSeq structural annotation based on RNAseq evidence. Theobroma cacao Criollo genome version 2 will be a valuable resource for the investigation of complex traits at the genomic level and for future comparative genomics and genetics studies in the cacao tree. New functional tools and annotations are available on the Cocoa Genome Hub ( http://cocoa-genome-hub.southgreen.fr ).
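The scaffold N50 quoted above is a standard contiguity statistic: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the toy lengths are assumptions made for illustration only.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first sequence that brings us to >= 50% of the total.
        if 2 * running >= total:
            return length


# Toy scaffold lengths (hypothetical, not taken from the cacao assembly).
toy_scaffolds = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_scaffolds))
```

Because the statistic depends only on the sorted lengths, merging many short scaffolds into a few long ones (as in the V1 to V2 comparison) raises N50 even when the total assembled length barely changes.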


July 7, 2019  |  

Genome architecture and evolution of a unichromosomal asexual nematode.

Asexual reproduction in animals, though rare, is the main or exclusive mode of reproduction in some long-lived lineages. The longevity of asexual clades may be correlated with the maintenance of heterozygosity by mechanisms that rearrange genomes and reduce recombination. Asexual species thus provide an opportunity to gain insight into the relationship between molecular changes, genome architecture, and cellular processes. Here we report the genome sequence of the parthenogenetic nematode Diploscapter pachys with only one chromosome pair. We show that this unichromosomal architecture is shared by a long-lived clade of asexual nematodes closely related to the genetic model organism Caenorhabditis elegans. Analysis of the genome assembly reveals that the unitary chromosome arose through fusion of six ancestral chromosomes, with extensive rearrangement among neighboring regions. Typical nematode telomeres and telomeric protection-encoding genes are lacking. Most regions show significant heterozygosity; homozygosity is largely concentrated to one region and attributed to gene conversion. Cell-biological and molecular evidence is consistent with the absence of key features of meiosis I, including synapsis and recombination. We propose that D. pachys preserves heterozygosity and produces diploid embryos without fertilization through a truncated meiosis. As a prelude to functional studies, we demonstrate that D. pachys is amenable to experimental manipulation by RNA interference. Copyright © 2017 Elsevier Ltd. All rights reserved.


July 7, 2019  |  

New insights into structural organization and gene duplication in a 1.75-Mb genomic region harboring the α-gliadin gene family in Aegilops tauschii, the source of wheat D genome.

Among the wheat prolamins important for its end-use traits, α-gliadins are the most abundant, and are also a major cause of food-related allergies and intolerances. Previous studies of various wheat species estimated that between 25 and 150 α-gliadin genes reside in the Gli-2 locus regions. To better understand the evolution of this complex gene family, the DNA sequence of a 1.75-Mb genomic region spanning the Gli-2 locus was analyzed in the diploid grass, Aegilops tauschii, the ancestral source of D genome in hexaploid bread wheat. Comparison with orthologous regions from rice, sorghum, and Brachypodium revealed rapid and dynamic changes only occurring to the Ae. tauschii Gli-2 region, including insertions of high numbers of non-syntenic genes and a high rate of tandem gene duplications, the latter of which have given rise to 12 copies of α-gliadin genes clustered within a 550-kb region. Among them, five copies have undergone pseudogenization by various mutation events. Insights into the evolutionary relationship of the duplicated α-gliadin genes were obtained from their genomic organization, transcription patterns, transposable element insertions and phylogenetic analyses. An ancestral glutamate-like receptor (GLR) gene encoding putative amino acid sensor in all four grass species has duplicated only in Ae. tauschii and generated three more copies that are interspersed with the α-gliadin genes. Phylogenetic inference and different gene expression patterns support functional divergence of the Ae. tauschii GLR copies after duplication. Our results suggest that the duplicates of α-gliadin and GLR genes have likely taken different evolutionary paths; conservation for the former and neofunctionalization for the latter. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.


July 7, 2019  |  

Comparative genome analysis of programmed DNA elimination in nematodes.

Programmed DNA elimination is a developmentally regulated process leading to the reproducible loss of specific genomic sequences. DNA elimination occurs in unicellular ciliates and a variety of metazoans, including invertebrates and vertebrates. In metazoa, DNA elimination typically occurs in somatic cells during early development, leaving the germline genome intact. Reference genomes for metazoa that undergo DNA elimination are not available. Here, we generated germline and somatic reference genome sequences of the DNA eliminating pig parasitic nematode Ascaris suum and the horse parasite Parascaris univalens. In addition, we carried out in-depth analyses of DNA elimination in the parasitic nematode of humans, Ascaris lumbricoides, and the parasitic nematode of dogs, Toxocara canis. Our analysis of nematode DNA elimination reveals that in all species, repetitive sequences (that differ among the genera) and germline-expressed genes (approximately 1000-2000 or 5%-10% of the genes) are eliminated. Thirty-five percent of these eliminated genes are conserved among these nematodes, defining a core set of eliminated genes that are preferentially expressed during spermatogenesis. Our analysis supports the view that DNA elimination in nematodes silences germline-expressed genes. Over half of the chromosome break sites are conserved between Ascaris and Parascaris, whereas only 10% are conserved in the more divergent T. canis. Analysis of the chromosomal breakage regions suggests a sequence-independent mechanism for DNA breakage followed by telomere healing, with the formation of more accessible chromatin in the break regions prior to DNA elimination. Our genome assemblies and annotations also provide comprehensive resources for analysis of DNA elimination, parasitology research, and comparative nematode genome and epigenome studies. © 2017 Wang et al.; Published by Cold Spring Harbor Laboratory Press.


July 7, 2019  |  

Genome sequence of the small brown planthopper, Laodelphax striatellus.

Laodelphax striatellus Fallén (Hemiptera: Delphacidae) is one of the most destructive rice pests. L. striatellus is different from 2 other rice planthoppers with a released genome sequence, Sogatella furcifera and Nilaparvata lugens, in many biological characteristics, such as host range, dispersal capacity, and vectoring plant viruses. Deciphering the genome of L. striatellus will further the understanding of the genetic basis of the biological differences among the 3 rice planthoppers. A total of 190 Gb of Illumina data and 32.4 Gb of PacBio data were generated and used to assemble a high-quality L. striatellus genome sequence, which is 541 Mb in length and has a contig N50 of 118 Kb and a scaffold N50 of 1.08 Mb. Annotated repetitive elements account for 25.7% of the genome. A total of 17,736 protein-coding genes were annotated, capturing 97.6% and 98% of the BUSCO eukaryote and arthropoda genes, respectively. Compared with N. lugens and S. furcifera, L. striatellus has the smallest genome and the lowest gene number. Gene family expansion and transcriptomic analyses provided hints to the genomic basis of the differences in important traits such as host range, migratory habit, and plant virus transmission between L. striatellus and the other 2 planthoppers. We report a high-quality genome assembly of L. striatellus, which is an important genomic resource not only for the study of the biology of L. striatellus and its interactions with plant hosts and plant viruses, but also for comparison with other planthoppers. © The Authors 2017. Published by Oxford University Press.


July 7, 2019  |  

Single molecule sequencing-guided scaffolding and correction of draft assemblies.

Although single molecule sequencing is still improving, the lengths of the generated sequences are inevitably an advantage in genome assembly. Prior work that utilizes long reads to conduct genome assembly has mostly focused on correcting sequencing errors and improving contiguity of de novo assemblies. We propose a disassembling-reassembling approach for both correcting structural errors in the draft assembly and scaffolding a target assembly based on error-corrected single molecule sequences. To achieve this goal, we formulate a maximum alternating path cover problem. We prove that this problem is NP-hard, and solve it by a 2-approximation algorithm. Our experimental results show that our approach can improve the structural correctness of target assemblies at the cost of some contiguity, even with smaller amounts of long reads. In addition, our reassembling process can also serve as a competitive scaffolder relative to well-established assembly benchmarks.


July 7, 2019  |  

Scaffolding of long read assemblies using long range contact information.

Long read technologies have revolutionized de novo genome assembly by generating contigs orders of magnitude longer than those of short read assemblies. Although assembly contiguity has increased, it usually does not reconstruct a full chromosome or an arm of the chromosome, resulting in an unfinished chromosome level assembly. To increase the contiguity of the assembly to the chromosome level, different strategies are used which exploit long range contact information between chromosomes in the genome. We develop a scalable and computationally efficient scaffolding method that can boost the assembly contiguity to a large extent using genome-wide chromatin interaction data such as Hi-C. We demonstrate an algorithm that uses Hi-C data for longer-range scaffolding of de novo long read genome assemblies. We tested our methods on the human and goat genome assemblies. We compare our scaffolds with the scaffolds generated by LACHESIS based on various metrics. Our new algorithm SALSA produces more accurate scaffolds compared to the existing state of the art method LACHESIS.


July 7, 2019  |  

Streptomyces thermoautotrophicus does not fix nitrogen.

Streptomyces thermoautotrophicus UBT1 has been described as a moderately thermophilic chemolithoautotroph with a novel nitrogenase enzyme that is oxygen-insensitive. We have cultured the UBT1 strain, and have isolated two new strains (H1 and P1-2) of very similar phenotypic and genetic characters. These strains show minimal growth on ammonium-free media, and fail to incorporate isotopically labeled N2 gas into biomass in multiple independent assays. The sdn genes previously published as the putative nitrogenase of S. thermoautotrophicus have little similarity to anything found in draft genome sequences, published here, for strains H1 and UBT1, but share >99% nucleotide identity with genes from Hydrogenibacillus schlegelii, a draft genome for which is also presented here. H. schlegelii similarly lacks nitrogenase genes and is a non-diazotroph. We propose reclassification of the species containing strains UBT1, H1, and P1-2 as a non-Streptomycete, non-diazotrophic, facultative chemolithoautotroph and conclude that the existence of the previously proposed oxygen-tolerant nitrogenase is extremely unlikely.


July 7, 2019  |  

Coevolution between Nuclear-encoded DNA replication, recombination, and repair genes and plastid genome complexity.

Disruption of DNA replication, recombination, and repair (DNA-RRR) systems has been hypothesized to cause highly elevated nucleotide substitution rates and genome rearrangements in the plastids of angiosperms, but this theory remains untested. To investigate nuclear-plastid genome (plastome) coevolution in Geraniaceae, four different measures of plastome complexity (rearrangements, repeats, nucleotide insertions/deletions, and substitution rates) were evaluated along with substitution rates of 12 nuclear-encoded, plastid-targeted DNA-RRR genes from 27 Geraniales species. Significant correlations were detected for nonsynonymous (dN) but not synonymous (dS) substitution rates for three DNA-RRR genes (uvrB/C, why1, and gyrA) supporting a role for these genes in accelerated plastid genome evolution in Geraniaceae. Furthermore, correlation between dN of uvrB/C and plastome complexity suggests the presence of nucleotide excision repair system in plastids. Significant correlations were also detected between plastome complexity and 13 of the 90 nuclear-encoded organelle-targeted genes investigated. Comparisons revealed significant acceleration of dN in plastid-targeted genes of Geraniales relative to Brassicales suggesting this correlation may be an artifact of elevated rates in this gene set in Geraniaceae. Correlation between dN of plastid-targeted DNA-RRR genes and plastome complexity supports the hypothesis that the aberrant patterns in angiosperm plastome evolution could be caused by dysfunction in DNA-RRR systems. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


July 7, 2019  |  

Insights into adaptations to a near-obligate nematode endoparasitic lifestyle from the finished genome of Drechmeria coniospora.

Nematophagous fungi employ three distinct predatory strategies: nematode trapping, parasitism of females and eggs, and endoparasitism. While endoparasites play key roles in controlling nematode populations in nature, their application for integrated pest management is hindered by the limited understanding of their biology. We present a comparative analysis of a high quality finished genome assembly of Drechmeria coniospora, a model endoparasitic nematophagous fungus, integrated with a transcriptomic study. Adaptation of D. coniospora to its almost completely obligate endoparasitic lifestyle led to the simplification of many orthologous gene families involved in the saprophytic trophic mode, while maintaining orthologs of most known fungal pathogen-host interaction proteins, stress response circuits and putative effectors of the small secreted protein type. The need to adhere to and penetrate the host cuticle led to a selective radiation of surface proteins and hydrolytic enzymes. Although the endoparasite has a simplified secondary metabolome, it produces a novel peptaibiotic family that shows antibacterial, antifungal and nematicidal activities. Our analyses emphasize the basic malleability of the D. coniospora genome: loss of genes advantageous for the saprophytic lifestyle; modulation of elements that its cohort species utilize for entomopathogenesis; and expansion of protein families necessary for the nematode endoparasitic lifestyle.


July 7, 2019  |  

BAC-pool sequencing and assembly of 19 Mb of the complex sugarcane genome.

Sequencing plant genomes is often challenging because of their complex architecture and high content of repetitive sequences. Sugarcane has one of the most complex genomes. It is highly polyploid, preserves intact homeologous chromosomes from its parental species and contains >55% repetitive sequences. Although bacterial artificial chromosome (BAC) libraries have emerged as an alternative for accessing the sugarcane genome, sequencing individual clones is laborious and expensive. Here, we present a strategy for sequencing and assembling reads produced from the DNA of pooled BAC clones. A set of 178 BAC clones, randomly sampled from the SP80-3280 sugarcane BAC library, was pooled and sequenced using the Illumina HiSeq2000 and PacBio platforms. A hybrid assembly strategy was used to generate 2,451 scaffolds comprising 19.2 Mb of assembled genome sequence. Scaffolds of ≥20 Kb corresponded to 80% of the assembled sequences, and the full sequences of forty BACs were recovered in one or two contigs. Alignment of the BAC scaffolds with the chromosome sequences of sorghum showed a high degree of collinearity and gene order. The alignment of the BAC scaffolds to the 10 sorghum chromosomes suggests that the genome of the SP80-3280 sugarcane variety is ~19% contracted in relation to the sorghum genome. In conclusion, our data show that sequencing pools composed of high numbers of BAC clones may help to construct a reference scaffold map of the sugarcane genome.


July 7, 2019  |  

The Atlantic salmon genome provides insights into rediploidization.

The whole-genome duplication 80 million years ago of the common ancestor of salmonids (salmonid-specific fourth vertebrate whole-genome duplication, Ss4R) provides unique opportunities to learn about the evolutionary fate of a duplicated vertebrate genome in 70 extant lineages. Here we present a high-quality genome assembly for Atlantic salmon (Salmo salar), and show that large genomic reorganizations, coinciding with bursts of transposon-mediated repeat expansions, were crucial for the post-Ss4R rediploidization process. Comparisons of duplicate gene expression patterns across a wide range of tissues with orthologous genes from a pre-Ss4R outgroup unexpectedly demonstrate far more instances of neofunctionalization than subfunctionalization. Surprisingly, we find that genes that were retained as duplicates after the teleost-specific whole-genome duplication 320 million years ago were not more likely to be retained after the Ss4R, and that the duplicate retention was not influenced to a great extent by the nature of the predicted protein interactions of the gene products. Finally, we demonstrate that the Atlantic salmon assembly can serve as a reference sequence for the study of other salmonids for a range of purposes.


July 7, 2019  |  

Haemonchus contortus: genome structure, organization and comparative genomics

One of the first genome sequencing projects for a parasitic nematode was that for Haemonchus contortus. The open access data from the Wellcome Trust Sanger Institute provided a valuable early resource for the research community, particularly for the identification of specific genes and genetic markers. Later, a second sequencing project was initiated by the University of Melbourne, and the two draft genome sequences for H. contortus were published back-to-back in 2013. There is a pressing need for long-range genomic information for genetic mapping, population genetics and functional genomic studies, so we are continuing to improve the Wellcome Trust Sanger Institute assembly to provide a finished reference genome for H. contortus. This review describes this process, compares the H. contortus genome assemblies with draft genomes from other members of the strongylid group and discusses future directions for parasite genomics using the H. contortus model. Copyright © 2016 Elsevier Ltd. All rights reserved.


July 7, 2019  |  

Dynamics of mutations during development of resistance by Pseudomonas aeruginosa against five antibiotics.

Pseudomonas aeruginosa is an opportunistic pathogen that causes considerable morbidity and mortality, specifically in intensive care. Antibiotic resistant variants of this organism are more difficult to treat and cause substantial extra costs compared to susceptible strains. In the laboratory, P. aeruginosa rapidly developed resistance against five medically relevant antibiotics upon exposure to step-wise increasing concentrations. At several time points during the acquisition of resistance, samples were taken for whole genome sequencing. The increase of MIC for ciprofloxacin was linked to specific mutations in gyrA, parC and gyrB, appearing sequentially. In the case of tobramycin, mutations were induced in fusA, HP02880, rplB and capD. The MIC for the beta-lactam compounds meropenem, ceftazidime and the combination piperacillin/tazobactam correlated linearly with the beta-lactamase activity, but not always with individual mutations. The genes that were mutated during development of beta-lactam resistance differed for each antibiotic. A quantitative relationship between the frequency of mutations and the increase in resistance could not be established for any of the antibiotics. When the adapted strains were grown in the absence of the antibiotic, some mutations remained and others were reverted, but this reversal did not necessarily lower the MIC. The increased MIC came at the cost of moderately reduced cellular functions, or somewhat lower growth rate. In all cases except ciprofloxacin, the increase of resistance seems to be the result of a complex interaction between several cellular systems, rather than individual mutations. Copyright © 2016, American Society for Microbiology. All Rights Reserved.


July 7, 2019  |  

OPERA-LG: efficient and exact scaffolding of large, repeat-rich eukaryotic genomes with performance guarantees.

The assembly of large, repeat-rich eukaryotic genomes represents a significant challenge in genomics. While long-read technologies have made the high-quality assembly of small, microbial genomes increasingly feasible, data generation can be expensive for larger genomes. OPERA-LG is a scalable, exact algorithm for the scaffold assembly of large, repeat-rich genomes, out-performing state-of-the-art programs for scaffold correctness and contiguity. It provides a rigorous framework for scaffolding of repetitive sequences and a systematic approach for combining data from different second-generation and third-generation sequencing technologies. OPERA-LG provides an avenue for systematic augmentation and improvement of thousands of existing draft eukaryotic genome assemblies.

