July 7, 2019

The Apostasia genome and the evolution of orchids.

Constituting approximately 10% of flowering plant species, orchids (Orchidaceae) display unique flower morphologies, possess an extraordinary diversity in lifestyle, and have successfully colonized almost every habitat on Earth. Here we report the draft genome sequence of Apostasia shenzhenica, a representative of one of two genera that form a sister lineage to the rest of the Orchidaceae, providing a reference for inferring the genome content and structure of the most recent common ancestor of all extant orchids and improving our understanding of their origins and evolution. In addition, we present transcriptome data for representatives of Vanilloideae, Cypripedioideae and Orchidoideae, and novel third-generation genome data for two species of Epidendroideae, covering all five orchid subfamilies. A. shenzhenica shows clear evidence of a whole-genome duplication, which is shared by all orchids and occurred shortly before their divergence. Comparisons between A. shenzhenica and other orchids and angiosperms also permitted the reconstruction of an ancestral orchid gene toolkit. We identify new gene families, gene family expansions and contractions, and changes within MADS-box gene classes, which control a diverse suite of developmental processes, during orchid evolution. This study sheds new light on the genetic mechanisms underpinning key orchid innovations, including the development of the labellum and gynostemium, pollinia, and seeds without endosperm, as well as the evolution of epiphytism; reveals relationships between the Orchidaceae subfamilies; and helps clarify the evolutionary history of orchids within the angiosperms.


July 7, 2019

Single-molecule sequencing and Hi-C-based proximity-guided assembly of amaranth (Amaranthus hypochondriacus) chromosomes provide insights into genome evolution.

Amaranth (Amaranthus hypochondriacus) was a food staple among the ancient civilizations of Central and South America that has recently received increased attention due to the high nutritional value of the seeds, with the potential to help alleviate malnutrition and food security concerns, particularly in arid and semiarid regions of the developing world. Here, we present a reference-quality assembly of the amaranth genome which will assist the agronomic development of the species. Utilizing single-molecule, real-time sequencing (Pacific Biosciences) and chromatin interaction mapping (Hi-C) to close assembly gaps and scaffold contigs, respectively, we improved our previously reported Illumina-based assembly to produce a chromosome-scale assembly with a scaffold N50 of 24.4 Mb. The 16 largest scaffolds contain 98% of the assembly and likely represent the haploid chromosomes (n = 16). To demonstrate the accuracy and utility of this approach, we produced physical and genetic maps and identified candidate genes for the betalain pigmentation pathway. The chromosome-scale assembly facilitated a genome-wide syntenic comparison of amaranth with other Amaranthaceae species, revealing chromosome loss and fusion events in amaranth that explain the reduction from the ancestral haploid chromosome number (n = 18) for a tetraploid member of the Amaranthaceae. The assembly method reported here minimizes cost by relying primarily on short-read technology and is one of the first reported uses of in vivo Hi-C for assembly of a plant genome. Our analyses implicate chromosome loss and fusion as major evolutionary events in the 2n = 32 amaranths and clearly establish the homoeologous relationship among most of the subgenome chromosomes, which will facilitate future investigations of intragenomic changes that occurred post polyploidization.
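
The scaffold N50 quoted above is a standard contiguity metric: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The sketch below illustrates that definition only; the function name `n50` and the toy lengths are hypothetical and are not taken from the amaranth assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)  # longest scaffolds first
    half_total = sum(ordered) / 2            # half of the assembly size
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to half defines N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example with made-up lengths (not real assembly data).
toy_lengths = [40, 30, 20, 10]
print(n50(toy_lengths))
```

Because N50 depends only on the length distribution, it describes contiguity and says nothing about whether the scaffolds are correctly joined.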


July 7, 2019

Complete genome sequence of a commensal bacterium, Hafnia alvei CBA7124, isolated from human feces.

Members of the genus Hafnia have been isolated from the feces of mammals, birds, reptiles, and fish, as well as from soil, water, sewage, and foods. Hafnia alvei is an opportunistic pathogen that has been implicated in intestinal and extraintestinal infections in humans. However, its pathogenicity is still unclear. In this study, we isolated H. alvei from human feces and performed sequencing as well as comparative genomic analysis to better understand its pathogenicity. The genome of H. alvei CBA7124 comprised a single circular chromosome with 4,585,298 bp and a GC content of 48.8%. The genome contained 25 rRNA genes (9 5S rRNA genes, 8 16S rRNA genes, and 8 23S rRNA genes), 88 tRNA genes, and 4043 protein-coding genes. Using comparative genomic analysis, the genome of this strain was found to have 72 strain-specific singletons. The genome also contained genes for antibiotic and antimicrobial resistance, as well as toxin-antitoxin systems. We revealed the complete genome sequence of the opportunistic gut pathogen, H. alvei CBA7124. We also performed comparative genomic analysis of the sequences in the genome of H. alvei CBA7124, and found that it contained strain-specific singletons, antibiotic resistance genes, and toxin-antitoxin systems. These results could improve our understanding of the pathogenicity and the mechanism behind the antibiotic resistance of H. alvei strains.


July 7, 2019

The cacao Criollo genome v2.0: an improved version of the genome for genetic and functional genomic studies.

Theobroma cacao L., native to the Amazonian basin of South America, is an economically important fruit tree crop for tropical countries as a source of chocolate. The first draft genome of the species, from a Criollo cultivar, was published in 2011. Although a useful resource, some improvements are possible, including identifying misassemblies, reducing the number of scaffolds and gaps, and anchoring un-anchored sequences to the 10 chromosomes. We used an NGS-based approach to significantly improve the assembly of the Belizean Criollo B97-61/B2 genome. We combined four Illumina large insert size mate-paired libraries with 52x of Pacific Biosciences long reads to correct misassembled regions and reduce the number of scaffolds. We then used genotyping by sequencing (GBS) methods to increase the proportion of the assembly anchored to chromosomes. The scaffold number decreased from 4,792 in assembly V1 to 554 in V2, while the scaffold N50 size increased from 0.47 Mb in V1 to 6.5 Mb in V2. A total of 96.7% of the assembly was anchored to the 10 chromosomes compared to 66.8% in the previous version. Unknown sites (Ns) were reduced from 10.8% to 5.7%. In addition, we updated the functional annotations and performed a new RefSeq structural annotation based on RNAseq evidence. Theobroma cacao Criollo genome version 2 will be a valuable resource for the investigation of complex traits at the genomic level and for future comparative genomics and genetics studies in the cacao tree. New functional tools and annotations are available on the Cocoa Genome Hub ( http://cocoa-genome-hub.southgreen.fr ).


July 7, 2019

XCAVATOR: accurate detection and genotyping of copy number variants from second and third generation whole-genome sequencing experiments.

We developed a novel software package, XCAVATOR, for the identification of genomic regions involved in copy number variants/alterations (CNVs/CNAs) from short- and long-read whole-genome sequencing experiments. By using simulated and real datasets we showed that our tool, based on a read count approach, is capable of predicting the boundaries and the absolute number of DNA copies of CNVs/CNAs with high resolution. To demonstrate the power of our software we applied it to the analysis of Illumina and Pacific Biosciences data and compared its performance to ten other state-of-the-art tools. All the analyses we performed demonstrate that XCAVATOR is capable of detecting germline and somatic CNVs/CNAs, outperforming all the other tools we compared. XCAVATOR is freely available at http://sourceforge.net/projects/xcavator/ .
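
As a rough illustration of what a read count approach means in general, the sketch below scales per-window read counts against a baseline to estimate integer copy number. It is a generic, simplified example and not XCAVATOR's algorithm: the function `estimate_copy_number`, the median baseline, the assumed ploidy of 2, and the toy counts are all assumptions made for illustration. Real tools also correct for biases such as GC content and mappability, and segment the signal.

```python
from statistics import median


def estimate_copy_number(window_counts, ploidy=2):
    """Estimate integer copy number per window from raw read counts.

    Assumes most windows are at the baseline ploidy, so the median
    count is taken to represent `ploidy` copies (a simplification).
    """
    baseline = median(window_counts)
    if baseline <= 0:
        raise ValueError("baseline read count must be positive")
    # Scale each window relative to the baseline and round to an integer.
    return [round(ploidy * count / baseline) for count in window_counts]


# Toy read counts per fixed-size window (made-up numbers).
counts = [100, 98, 102, 150, 149, 101, 52, 99]
print(estimate_copy_number(counts))
```

In this toy input the two windows near 150 reads suggest a gain and the window near 52 reads suggests a loss relative to the diploid baseline.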


July 7, 2019

The Mobile Element Locator Tool (MELT): population-scale mobile element discovery and biology.

Mobile element insertions (MEIs) represent ~25% of all structural variants in human genomes. Moreover, when they disrupt genes, MEIs can influence human traits and diseases. Therefore, MEIs should be fully discovered along with other forms of genetic variation in whole genome sequencing (WGS) projects involving population genetics, human diseases, and clinical genomics. Here, we describe the Mobile Element Locator Tool (MELT), which was developed as part of the 1000 Genomes Project to perform MEI discovery on a population scale. Using both Illumina WGS data and simulations, we demonstrate that MELT outperforms existing MEI discovery tools in terms of speed, scalability, specificity, and sensitivity, while also detecting a broader spectrum of MEI-associated features. Several run modes were developed to perform MEI discovery on local and cloud systems. In addition to using MELT to discover MEIs in modern humans as part of the 1000 Genomes Project, we also used it to discover MEIs in chimpanzees and ancient (Neanderthal and Denisovan) hominids. We detected diverse patterns of MEI stratification across these populations that likely were caused by (1) diverse rates of MEI production from source elements, (2) diverse patterns of MEI inheritance, and (3) the introgression of ancient MEIs into modern human genomes. Overall, our study provides the most comprehensive map of MEIs to date spanning chimpanzees, ancient hominids, and modern humans and reveals new aspects of MEI biology in these lineages. We also demonstrate that MELT is a robust platform for MEI discovery and analysis in a variety of experimental settings. © 2017 Gardner et al.; Published by Cold Spring Harbor Laboratory Press.


July 7, 2019

Comparative genomics of maize ear rot pathogens reveals expansion of carbohydrate-active enzymes and secondary metabolism backbone genes in Stenocarpella maydis.

Stenocarpella maydis is a plant pathogenic fungus that causes Diplodia ear rot, one of the most destructive diseases of maize. To date, little information is available regarding the molecular basis of pathogenesis in this organism, in part due to limited genomic resources. In this study, a 54.8 Mb draft genome assembly of S. maydis was obtained with Illumina and PacBio sequencing technologies, and analyzed. Comparative genomic analyses with the predominant maize ear rot pathogens Aspergillus flavus, Fusarium verticillioides, and Fusarium graminearum revealed an expanded set of carbohydrate-active enzymes for cellulose and hemicellulose degradation in S. maydis. Analyses of predicted genes involved in starch degradation revealed six putative α-amylases, four extracellular and two intracellular, and two putative γ-amylases, one of which appears to have been acquired from bacteria via horizontal transfer. Additionally, 87 backbone genes involved in secondary metabolism were identified, which represents one of the largest known assemblages among Pezizomycotina species. Numerous secondary metabolite gene clusters were identified, including two clusters likely involved in the biosynthesis of diplodiatoxin and chaetoglobosins. The draft genome of S. maydis presented here will serve as a useful resource for molecular genetics, functional genomics, and analyses of population diversity in this organism. Copyright © 2017 British Mycological Society. Published by Elsevier Ltd. All rights reserved.


July 7, 2019

Aestuarium zhoushanense gen. nov., sp. nov., Isolated from the Tidal Flat.

A Gram-stain-negative, aerobic, ovoid or short rod-shaped, and non-motile strain, designated G7T, was isolated from a tidal flat sample collected from the coast of the East Sea in Zhoushan, China. Strain G7T grew at 4-40 °C and pH 6.0-9.0 (optimum, 28 °C and pH 7.5) and with 0-7% (w/v) NaCl (optimum, 1%). The predominant respiratory quinone was Q-10 and the major fatty acids (>10%) identified were C18:1 ω7c, C16:0 and summed feature 3 (C16:1 ω7c and/or C16:1 ω6c). The polar lipids of strain G7T consisted of phosphatidylglycerol, phosphatidylethanolamine, phosphatidylcholine, and four unidentified lipids. The genomic DNA G+C content was 56.7 mol%. Phylogenetic analysis based on 16S rRNA gene sequences showed that strain G7T formed a distinct lineage belonging to the Roseobacter clade of the family Rhodobacteraceae. On the basis of morphological, physiological, and chemotaxonomic characteristics, together with the results of phylogenetic analysis, strain G7T is described as a novel species in a new genus, for which the name Aestuarium zhoushanense gen. nov., sp. nov. (type strain G7T = MCCC 1K03229T = KCTC 52584T) is proposed.


July 7, 2019

New insights into structural organization and gene duplication in a 1.75-Mb genomic region harboring the α-gliadin gene family in Aegilops tauschii, the source of wheat D genome.

Among the wheat prolamins important for its end-use traits, α-gliadins are the most abundant, and are also a major cause of food-related allergies and intolerances. Previous studies of various wheat species estimated that between 25 and 150 α-gliadin genes reside in the Gli-2 locus regions. To better understand the evolution of this complex gene family, the DNA sequence of a 1.75-Mb genomic region spanning the Gli-2 locus was analyzed in the diploid grass, Aegilops tauschii, the ancestral source of D genome in hexaploid bread wheat. Comparison with orthologous regions from rice, sorghum, and Brachypodium revealed rapid and dynamic changes only occurring to the Ae. tauschii Gli-2 region, including insertions of high numbers of non-syntenic genes and a high rate of tandem gene duplications, the latter of which have given rise to 12 copies of α-gliadin genes clustered within a 550-kb region. Among them, five copies have undergone pseudogenization by various mutation events. Insights into the evolutionary relationship of the duplicated α-gliadin genes were obtained from their genomic organization, transcription patterns, transposable element insertions and phylogenetic analyses. An ancestral glutamate-like receptor (GLR) gene encoding putative amino acid sensor in all four grass species has duplicated only in Ae. tauschii and generated three more copies that are interspersed with the α-gliadin genes. Phylogenetic inference and different gene expression patterns support functional divergence of the Ae. tauschii GLR copies after duplication. Our results suggest that the duplicates of α-gliadin and GLR genes have likely taken different evolutionary paths; conservation for the former and neofunctionalization for the latter. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.


July 7, 2019

Complete genome sequence of Mesorhizobium sophorae ICMP 19535T, a highly specific, nitrogen-fixing symbiont of New Zealand endemic Sophora spp.

We report here the complete genome sequence of Mesorhizobium sophorae ICMP 19535(T). This strain was isolated from Sophora microphylla root nodules and can nodulate and fix nitrogen with this host and also with Sophora prostrata, Sophora longicarinata, and Clianthus puniceus. The genome consists of 8.05 Mb. Copyright © 2017 De Meyer et al.


July 7, 2019

Whole-genome assembly of Babesia ovata and comparative genomics between closely related pathogens.

Babesia ovata, belonging to the phylum Apicomplexa, is an infectious parasite of bovids. It is not associated with the manifestation of severe symptoms, in contrast to other types of bovine babesiosis caused by B. bovis and B. bigemina; however, upon co-infection with Theileria orientalis, it occasionally induces exacerbated symptoms. Asymptomatic chronic infection in bovines is usually observed only for B. ovata. Comparative genomic analysis could potentially reveal factors involved in these distinguishing characteristics; however, the genomic and molecular basis of these phenotypes remains elusive, especially in B. ovata. From a technical perspective, the current development of a very long read sequencer, MinION, will facilitate the obtainment of highly integrated genome sequences. Therefore, we applied next-generation sequencing to acquire a high-quality genome of the parasite, which provides fundamental information for understanding apicomplexans. The genome was assembled into 14,453,397 bp in size with 5031 protein-coding sequences (91 contigs and N50 = 2,090,503 bp). Gene family analysis revealed that ves1 alpha and beta, which belong to multigene families in B. bovis, were absent from B. ovata, the same as in B. bigemina. Instead, ves1a and ves1b, which were originally specified in B. bigemina, were present. The B. ovata and B. bigemina ves1a genes form one cluster together, even though they divide into two sub-clusters according to species. In contrast, the ves1b cluster was more dispersed and the overlap between B. ovata and B. bigemina was limited. The observed redundancy and rapid evolution in sequence might reflect the adaptive history of these parasites. Moreover, some candidate genes potentially involved in the distinct phenotypes were specified by functional analysis. An anamorsin homolog is one of them. The human anamorsin is involved in hematopoiesis, and the homolog was present in B. ovata but absent in B. bigemina, which causes severe anemia. Taking these findings together, the differences demonstrated by comparative genomics potentially explain the evolutionary history of these parasites and the differences in their phenotypes. In addition, the draft genome provides fundamental information for further characterization and understanding of these parasites.


July 7, 2019

Public health surveillance in the UK revolutionises our understanding of the invasive Salmonella Typhimurium epidemic in Africa.

The ST313 sequence type of Salmonella Typhimurium causes invasive non-typhoidal salmonellosis and was thought to be confined to sub-Saharan Africa. Two distinct phylogenetic lineages of African ST313 have been identified. We analysed the whole genome sequences of S. Typhimurium isolates from UK patients that were generated following the introduction of routine whole-genome sequencing (WGS) of Salmonella enterica by Public Health England in 2014. We found that 2.7% (84/3147) of S. Typhimurium from patients in England and Wales were ST313 and were associated with gastrointestinal infection. Phylogenetic analysis revealed novel diversity of ST313 that distinguished UK-linked gastrointestinal isolates from African-associated extra-intestinal isolates. The majority of genome degradation of African ST313 lineage 2 was conserved in the UK-ST313, but the African lineages carried a characteristic prophage and antibiotic resistance gene repertoire. These findings suggest that a strong selection pressure exists for certain horizontally acquired genetic elements in the African setting. One UK-isolated lineage 2 strain that probably originated in Kenya carried a chromosomally located bla CTX-M-15, demonstrating the continual evolution of this sequence type in Africa in response to widespread antibiotic usage. The discovery of ST313 isolates responsible for gastroenteritis in the UK reveals new diversity in this important sequence type. This study highlights the power of routine WGS by public health agencies to make epidemiologically significant deductions that would be missed by conventional microbiological methods. We speculate that the niche specialisation of sub-Saharan African ST313 lineages is driven in part by the acquisition of accessory genome elements.


July 7, 2019

Dense and accurate whole-chromosome haplotyping of individual genomes.

The diploid nature of the human genome is neglected in many analyses done today, where a genome is perceived as a set of unphased variants with respect to a reference genome. This lack of haplotype-level analyses can be explained by a lack of methods that can produce dense and accurate chromosome-length haplotypes at reasonable costs. Here we introduce an integrative phasing strategy that combines global, but sparse haplotypes obtained from strand-specific single-cell sequencing (Strand-seq) with dense, yet local, haplotype information available through long-read or linked-read sequencing. We provide comprehensive guidance on the required sequencing depths and reliably assign more than 95% of alleles (NA12878) to their parental haplotypes using as few as 10 Strand-seq libraries in combination with 10-fold coverage PacBio data or, alternatively, 10X Genomics linked-read sequencing data. We conclude that the combination of Strand-seq with different technologies represents an attractive solution to chart the genetic variation of diploid genomes.


July 7, 2019

Hybrid de novo genome assembly and centromere characterization of the gray mouse lemur (Microcebus murinus).

The de novo assembly of repeat-rich mammalian genomes using only high-throughput short read sequencing data typically results in highly fragmented genome assemblies that limit downstream applications. Here, we present an iterative approach to hybrid de novo genome assembly that incorporates datasets stemming from multiple genomic technologies and methods. We used this approach to improve the gray mouse lemur (Microcebus murinus) genome from early draft status to a near chromosome-scale assembly. We used a combination of advanced genomic technologies to iteratively resolve conflicts and super-scaffold the M. murinus genome. We improved the M. murinus genome assembly to a scaffold N50 of 93.32 Mb. Whole genome alignments between our primary super-scaffolds and 23 human chromosomes revealed patterns that are congruent with historical comparative cytogenetic data, thus demonstrating the accuracy of our de novo scaffolding approach and allowing assignment of scaffolds to M. murinus chromosomes. Moreover, we utilized our independent datasets to discover and characterize sequences associated with centromeres across the mouse lemur genome. Quality assessment of the final assembly found 96% of mouse lemur canonical transcripts nearly complete, comparable to other published high-quality reference genome assemblies. We describe a new assembly of the gray mouse lemur (Microcebus murinus) genome with chromosome-scale scaffolds produced using a hybrid bioinformatic and sequencing approach. The approach is cost effective and produces superior results based on metrics of contiguity and completeness. Our results show that emerging genomic technologies can be used in combination to characterize centromeres of non-model species and to produce accurate de novo chromosome-scale genome assemblies of complex mammalian genomes.


July 7, 2019

Complete genome sequence of Streptococcus thermophilus strain B59671, which naturally produces the broad-spectrum bacteriocin thermophilin 110.

Streptococcus thermophilus strain B59671 is a Gram-positive lactic acid bacterium that naturally produces a broad-spectrum bacteriocin, thermophilin 110, and is capable of producing gamma-aminobutyric acid (GABA). The complete genome sequence for this strain contains 1,821,173 nucleotides, 1,936 predicted genes, and an average G+C content of 39.1%.

