September 22, 2019  |  

PBHoover and CigarRoller: a method for confident haploid variant calling on Pacific Biosciences data and its application to heterogeneous population analysis

Motivation: Single Molecule Real-Time (SMRT) sequencing has important and underutilized advantages that amplification-based platforms lack. These include a lack of systematic error (e.g. GC bias) and complete de novo assembly, including large repetitive regions, without scaffolding. SMRT sequencing, however, suffers from a high random error rate and, in older chemistries, low sequencing depth. Here, we introduce PBHoover, software that uses a heuristic calling algorithm to make base calls with high certainty in low-coverage regions. This software is also capable of mixed population detection with high sensitivity. PBHoover's CigarRoller attachment improves sequencing depth in low-coverage regions through CIGAR-string correction. Results: We tested both modules on 348 M. tuberculosis clinical isolates sequenced on C1 or C2 chemistries. On average, CigarRoller improved the percentage of usable reads from 68.9% to 99.98% in C1 runs and from 50% to 99% in C2 runs. Using the greater depth provided by CigarRoller, PBHoover was able to make base and variant calls 99.95% concordant with Sanger calls (QV33). PBHoover also detected antibiotic-resistant subpopulations that went undetected by Sanger. Using C1 chemistry, subpopulations as small as 9% of the total colony can be detected by PBHoover. This provides the most sensitive amplification-free molecular method for heterogeneity analysis and is in line with phenotypic methods' sensitivity. This sensitivity significantly improves with the greater depth and lower error rate of the newer chemistries. Availability and Implementation: Executables are freely available under GNU GPL v3+ at http://www.gitlab.com/LPCDRP/pbhoover and http://www.gitlab.com/LPCDRP/CigarRoller. PBHoover is also available on bioconda: https://anaconda.org/bioconda/pbhoover.
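
The pairing of 99.95% concordance with QV33 in the abstract above is consistent with the standard Phred relation, QV = -10 * log10(error rate). The short sketch below only illustrates that conversion; the function name `phred_qv` is hypothetical and is not part of PBHoover or CigarRoller.

```python
import math


def phred_qv(accuracy: float) -> float:
    """Convert a fractional accuracy (0 <= accuracy < 1) to a Phred-scaled quality value."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in the range [0, 1)")
    error_rate = 1.0 - accuracy
    return -10.0 * math.log10(error_rate)


# Concordance with Sanger calls as reported in the abstract.
concordance = 0.9995
qv = phred_qv(concordance)
print(f"Concordance {concordance:.2%} corresponds to roughly QV{qv:.0f}")
```

An error rate of 0.05% (5 errors per 10,000 calls) therefore rounds to QV33, matching the value quoted in the abstract.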


September 22, 2019  |  

Complete sequence of kenaf (Hibiscus cannabinus) mitochondrial genome and comparative analysis with the mitochondrial genomes of other plants.

Plant mitochondrial (mt) genomes are species specific due to extensive foreign DNA migration and frequent recombination of repeated sequences. Sequencing of the mt genome of kenaf (Hibiscus cannabinus) is essential for elucidating its evolutionary characteristics. In the present study, single-molecule real-time (SMRT) sequencing technology was used to sequence the complete mt genome of kenaf. Results showed that the complete kenaf mt genome was 569,915 bp long and consisted of 62 genes, including 36 protein-coding, 3 rRNA and 23 tRNA genes. Twenty-five introns were found among nine of the 36 protein-coding genes, and five introns were trans-spliced. A comparative analysis with other plant mt genomes showed that four syntenic gene clusters were conserved in all plant mtDNAs. Fifteen chloroplast-derived fragments were strongly associated with mt genes, including the intact sequences of the chloroplast genes psaA, ndhB and rps7. According to the plant mt genome evolution analysis, some ribosomal protein genes and succinate dehydrogenase genes were frequently lost during the evolution of angiosperms. Our data suggest that the kenaf mt genome retained evolutionarily conserved characteristics. Overall, the complete sequencing of the kenaf mt genome provides additional information and enhances our understanding of mt genomic evolution across angiosperms.


September 22, 2019  |  

A gene-rich fraction analysis of the Passiflora edulis genome reveals highly conserved microsyntenic regions with two related Malpighiales species.

Passiflora edulis is the most widely cultivated species of passionflower, cropped mainly for industrialized juice production and fresh fruit consumption. Despite its commercial importance, little is known about the genome structure of P. edulis. To fill this gap in our knowledge, a genomic library was built, and over 100 large inserts have now been completely sequenced. Sequencing data were assembled from long sequence reads, and structural sequence annotation resulted in the prediction of about 1,900 genes, providing data for subsequent functional analysis. The richness of repetitive elements was also evaluated. Microsyntenic regions of P. edulis common to Populus trichocarpa and Manihot esculenta, two related Malpighiales species with available fully sequenced genomes, were examined. Overall, gene order was well conserved, with some disruptions of collinearity identified as rearrangements, such as inversion and translocation events. The microsynteny level observed between the P. edulis sequences and the compared genomes is surprising, given the long divergence time that separates them from the common ancestor. P. edulis gene-rich segments are more compact than those of the other two species, even though its genome is much larger. This study provides a first accurate gene set for P. edulis, opening the way for new studies on the evolutionary issues in Malpighiales genomes.


September 22, 2019  |  

Orphan legumes growing in dry environments: Marama bean as a case study.

Plants have developed morphological, physiological, biochemical, cellular, and molecular mechanisms to survive in drought-stricken environments with little or no water caused by below-average precipitation. In this mini-review, we highlight the characteristics that allow marama bean [Tylosema esculentum (Burchell) Schreiber], an example of an orphan legume native to arid regions of southwestern Southern Africa, to flourish under an inhospitable climate and dry soil conditions where no other agricultural crop competes in this agro-ecological zone. Orphan legumes are often better suited to withstand such harsh growth environments due to the development of survival strategies using a combination of different traits and responses. Recent findings on marama bean speciation, hybridization, population dynamics, and evolutionary history will be reviewed, together with the mechanisms by which the bean is able to extract and conserve water and nutrients from its environment and aspects of its morphological and physiological adaptation. The importance of the soil microbiome and the genetic diversity in this species, and their interplay, as a reservoir for improvement will also be considered. In particular, the application of the newly established marama bean genome sequence will facilitate both the identification of important genes involved in the interaction with the soil microbiome and the identification of the diversity within the wild germplasm for genes involved in drought tolerance. Since predicted future changes in climatic conditions, with less water availability for plant growth, will severely affect agricultural productivity, an understanding of the mechanisms of unique adaptations in marama bean to such conditions may also provide insights as to how to improve the performance of the major crops.


September 22, 2019  |  

Genome analyses of the microalga Picochlorum provide insights into the evolution of thermotolerance in the green lineage.

While the molecular events involved in cell responses to heat stress have been extensively studied, our understanding of the genetic basis of basal thermotolerance, and particularly its evolution within the green lineage, remains limited. Here, we present the 13.3-Mb haploid genome and transcriptomes of a halotolerant and thermotolerant unicellular green alga, Picochlorum costavermella (Trebouxiophyceae), to investigate the evolution of the genomic basis of thermotolerance. Differential gene expression at high and standard temperatures revealed that more of the gene families containing up-regulated genes at high temperature were recently evolved, and fewer originated at the ancestor of green plants. Inversely, there was an excess of ancient gene families containing transcriptionally repressed genes. Interestingly, there is a striking overlap between the thermotolerance and halotolerance transcriptional rewiring, as more than one-third of the gene families up-regulated at 35 °C were also up-regulated under variable salt concentrations in Picochlorum SE3. Moreover, phylogenetic analysis of the 9,304 protein-coding genes revealed 26 genes of horizontally transferred origin in P. costavermella, of which five were differentially expressed at higher temperature. Altogether, these results provide new insights about how the genomic basis of adaptation to halo- and thermotolerance evolved in the green lineage.


September 22, 2019  |  

Draft genome sequence of wild Prunus yedoensis reveals massive inter-specific hybridization between sympatric flowering cherries.

Hybridization is an important evolutionary process that results in increased plant diversity. Flowering Prunus includes popular cherry species that are appreciated worldwide for their flowers. The ornamental characteristics were acquired both naturally and through artificially hybridizing species with heterozygous genomes. Therefore, the genome of hybrid flowering Prunus presents important challenges both in plant genomics and evolutionary biology. We use long reads to sequence and analyze the highly heterozygous genome of wild Prunus yedoensis. The genome assembly covers >93% of the gene space; annotation identified 41,294 protein-coding genes. Comparative analysis of the genome with 16 accessions of six related taxa shows that 41% of the genes were assigned into the maternal or paternal state. This indicates that wild P. yedoensis is an F1 hybrid originating from a cross between maternal P. pendula f. ascendens and paternal P. jamasakura, and it can be clearly distinguished from its confusing taxon, Yoshino cherry. A focused analysis of the S-locus haplotypes of closely related taxa distributed in a sympatric natural habitat suggests that reduced restriction of inter-specific hybridization due to strong gametophytic self-incompatibility is likely to promote complex hybridization of wild Prunus species and the development of a hybrid swarm. We report the draft genome assembly of a natural hybrid Prunus species using long-read sequencing and sequence phasing. Based on a comprehensive comparative genome analysis with related taxa, it appears that cross-species hybridization in sympatric habitats is an ongoing process that facilitates the diversification of flowering Prunus.


September 22, 2019  |  

Exploring the limits and causes of plastid genome expansion in volvocine green algae.

Plastid genomes are not normally celebrated for being large. But researchers are steadily uncovering algal lineages with big and, in rare cases, enormous plastid DNAs (ptDNAs), such as volvocine green algae. Plastome sequencing of five different volvocine species has revealed some of the largest, most repeat-dense plastomes on record, including that of Volvox carteri (~525 kb). Volvocine algae have also been used as models for testing leading hypotheses on organelle genome evolution (e.g., the mutational hazard hypothesis), and it has been suggested that ptDNA inflation within this group might be a consequence of low mutation rates and/or the transition from a unicellular to multicellular existence. Here, we further our understanding of plastome size variation in the volvocine line by examining the ptDNA sequences of the colonial species Yamagishiella unicocca and Eudorina sp. NIES-3984 and the multicellular Volvox africanus, which are phylogenetically situated between species with known ptDNA sizes. Although V. africanus is closely related and similar in multicellular organization to V. carteri, its ptDNA was much less inflated than that of V. carteri. Synonymous- and noncoding-site nucleotide substitution rate analyses of these two Volvox ptDNAs suggest that there are drastically different plastid mutation rates operating in the coding versus intergenic regions, supporting the idea that error-prone DNA repair in repeat-rich intergenic spacers is contributing to genome expansion. Our results reinforce the idea that the volvocine line harbors extremes in plastome size but ultimately shed doubt on some of the previously proposed hypotheses for ptDNA inflation within the lineage.


September 22, 2019  |  

Comparison of the mitochondrial genome sequences of six Annulohypoxylon stygium isolates suggests short fragment insertions as a potential factor leading to larger genomic size.

Mitochondrial DNA (mtDNA) is a core non-nuclear genetic material found in all eukaryotic organisms, the size of which varies extensively in the eumycota, even within species. In this study, mitochondrial genomes of six isolates of Annulohypoxylon stygium (Lév.) were assembled from raw reads from PacBio and Illumina sequencing. The diversity of genomic structures, conserved genes, intergenic regions and introns were analyzed and compared. Genome sizes ranged from 132 to 147 kb and contained the same sets of conserved protein-coding, tRNA and rRNA genes and shared the same gene arrangements and orientation. In addition, most intergenic regions were homogeneous and had similar sizes except for the region between cytochrome b (cob) and cytochrome c oxidase I (cox1) genes which ranged from 2,998 to 8,039 bp among the six isolates. Sixty-five intron insertion sites and 99 different introns were detected in these genomes. Each genome contained 45 or more introns, which varied in distribution and content. Introns from homologous insertion sites also showed high diversity in size, type and content. Comparison of introns at the same loci showed some complex introns, such as twintrons and ORF-less introns. There were 44 short fragment insertions detected within introns, intergenic regions, or as introns, some of them located at conserved domain regions of homing endonuclease genes. Insertions of short fragments such as small inverted repeats might affect or hinder the movement of introns, and these allowed for intron accumulation in the mitochondrial genomes analyzed, and enlarged their size. This study showed that the evolution of fungal mitochondrial introns is complex, and the results suggest short fragment insertions as a potential factor leading to larger mitochondrial genomes in A. stygium.


September 22, 2019  |  

Genome-wide researches and applications on Dendrobium.

This review summarizes current knowledge of chromosome characterization, genetic mapping, genomic sequencing, quality formation, floral transition, propagation, and identification in Dendrobium. The widely distributed genus Dendrobium has long been studied because of its economic importance as both a medicinal and an ornamental plant. In recent years, genomic sequences of some Dendrobium species and other orchids have been reported using next-generation sequencing technology, and the chloroplast genomes of many Dendrobium species have also been determined. The chromosomes of most Dendrobium species are mini-chromosomes, with 2n = 38. Only a few genetic studies have been reported in Dendrobium. With genomic sequences available, transcriptomics, proteomics and metabolomics techniques can readily be applied to Dendrobium. Other molecular biological techniques, such as gene cloning, gene editing, genetic transformation and molecular marker development, have in turn been applied to basic research on Dendrobium. For Dendrobium as a medicinal plant, insights into the biosynthesis of medicinal components are the most important; as an ornamental plant, the regulation of flower-related characteristics is the most important. In addition, knowledge of growth and development, environmental interaction, evolutionary analysis, breeding of new cultivars, propagation, and identification of species and herbs is required for commercial use. All of these studies have been improved by genomic sequences and related technologies. To answer key scientific questions in Dendrobium, future research should focus on quality formation, flowering, self-incompatibility and seed germination, and genome-related technologies and studies will be helpful.


September 22, 2019  |  

Haematococcus lacustris: the makings of a giant-sized chloroplast genome.

Recent work on the chlamydomonadalean green alga Haematococcus lacustris uncovered the largest plastid genome on record: a whopping 1.35 Mb with >90 % non-coding DNA. A 500-word description of this genome was published in the journal Genome Announcements. But such a short report for such a large genome leaves many unanswered questions. For instance, the H. lacustris plastome was found to encode only 12 tRNAs, less than half that of a typical plastome, it appears to have a non-standard genetic code, and is one of only a few known plastid DNAs (ptDNAs), out of thousands of available sequences, not biased in adenine and thymine. Here, I take a closer look at the H. lacustris plastome, comparing its size, content and architecture to other large organelle DNAs, including those from close relatives in the Chlamydomonadales. I show that the H. lacustris plastid coding repertoire is not as unusual as initially thought, representing a standard set of rRNAs, tRNAs and protein-coding genes, where the canonical stop codon UGA appears to sometimes signify tryptophan. The intergenic spacers are dense with repeats, and it is within these regions where potential answers to the source of such extreme genomic expansion lie. By comparing ptDNA sequences of two closely related strains of H. lacustris, I argue that the mutation rate of the non-coding DNA is high and contributing to plastome inflation. Finally, by exploring publicly available RNA-sequencing data, I find that most of the intergenic ptDNA is transcriptionally active.


September 22, 2019  |  

The complete chloroplast genome sequence of Coix lacryma-jobi L.(Poaceae), a cereal and medicinal crop

Coix lacryma-jobi is a cereal and medicinal crop belonging to the Poaceae family. This study characterized the complete chloroplast genome sequence of a Korean cultivar, Johyun, of C. lacryma-jobi var. ma-yuen through de novo hybrid assembly with Illumina and PacBio genomic reads. The chloroplast genome is 140,863 bp long and composed of a large single copy (82,827 bp), a small single copy (12,522 bp), and a pair of inverted repeats (each 22,757 bp). A total of 123 genes including 87 protein-coding genes, 32 tRNA genes, and four rRNA genes were predicted in the genome. Phylogenetic analysis confirmed a close relationship of C. lacryma-jobi with species in the Panicoideae subfamily of the Poaceae family.
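
The region lengths and gene counts reported in this abstract can be cross-checked with simple arithmetic: a quadripartite chloroplast genome is the large single copy plus the small single copy plus two copies of the inverted repeat. The sketch below uses only the figures quoted above; the variable names are illustrative.

```python
# Region lengths (bp) as reported in the abstract.
large_single_copy = 82_827
small_single_copy = 12_522
inverted_repeat = 22_757  # present twice in the genome

# Total length of the quadripartite chloroplast genome.
genome_length = large_single_copy + small_single_copy + 2 * inverted_repeat

# Gene counts as reported in the abstract.
protein_coding = 87
trna = 32
rrna = 4
total_genes = protein_coding + trna + rrna

print(f"Genome length: {genome_length:,} bp")
print(f"Predicted genes: {total_genes}")
```

Both sums agree with the reported totals of 140,863 bp and 123 genes.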


September 22, 2019  |  

Isolation, characterization, genomic sequencing, and GFP-marked insertional mutagenesis of a high-performance nitrogen-fixing bacterium, Kosakonia radicincitans GXGL-4A and visualization of bacterial colonization on cucumber roots.

A gram-negative bacterium, GXGL-4A, was originally isolated from maize roots. It displayed nitrogen-fixing (NF) ability under nitrogen-free culture conditions and significantly promoted cucumber growth in a pot inoculation test. The preliminary physiological and biochemical traits of GXGL-4A were characterized. Furthermore, a phylogenetic tree was constructed based on 16S ribosomal DNA (rDNA) sequences of genetically related species. To determine the taxonomic status of GXGL-4A and further utilize its nitrogen-fixing potential, its genome sequence was obtained using PacBio RS II technology. The analyses of average nucleotide identity based on BLAST+ (ANIb) and correlation indexes of tetra-nucleotide signatures (Tetra) showed that the NF isolate GXGL-4A is closely related to the Kosakonia radicincitans type strain DSM 16656. Therefore, the isolate GXGL-4A was classified into the species Kosakonia radicincitans and designated K. radicincitans GXGL-4A. A high consistency in composition and gene arrangement of nitrogen-fixing gene cluster I (nif cluster I) was found between K. radicincitans GXGL-4A and other Kosakonia NF strains. Mutants tagged with green fluorescent protein (GFP) were obtained by transposon Tn5 mutagenesis, and the colonization of gfp-marked K. radicincitans GXGL-4A cells on cucumber seedling roots was then observed under fluorescence microscopy. The preferential sites of the labeled GXGL-4A cell population were the lateral root junctions, the differentiation zone, and the elongation zone. These results should benefit deeper exploration of the nitrogen fixation mechanism of K. radicincitans GXGL-4A and facilitate the genetic modification of this NF bacterium for sustainable agriculture.


September 22, 2019  |  

Genomic characterization reveals significant divergence within Chlorella sorokiniana (Chlorellales, Trebouxiophyceae)

Selection of highly productive algal strains is crucial for establishing economically viable biomass and bioproduct cultivation systems. Characterization of algal genomes, including understanding strain-specific differences in genome content and architecture, is a critical step in this process. Using genomic analyses, we demonstrate significant differences between three strains of Chlorella sorokiniana (strain 1228, UTEX 1230, and DOE1412). We found that unique, strain-specific genes comprise a substantial proportion of each genome, and genomic regions with >80% local nucleotide identity constitute <15% of each genome among the strains, indicating substantial strain-specific evolution. Furthermore, cataloging of meiosis and other sex-related genes in C. sorokiniana strains suggests strategic breeding could be utilized to improve biomass and bioproduct yields if a sexual cycle can be characterized. Finally, preliminary investigation of epigenetic machinery suggests the presence of potentially unique transcriptional regulation in each strain. Our data demonstrate that these three C. sorokiniana strains represent significantly different genomic content. Based on these findings, we propose individualized assessment of each strain for potential performance in cultivation systems.


September 22, 2019  |  

Genomic analysis of Picochlorum species reveals how microalgae may adapt to variable environments.

Understanding how microalgae adapt to rapidly changing environments is not only important to science but can help clarify the potential impact of climate change on the biology of primary producers. We sequenced and analyzed the nuclear genome of multiple Picochlorum isolates (Chlorophyta) to elucidate strategies of environmental adaptation. It was previously found that coordinated gene regulation is involved in adaptation to salinity stress, and here we show that gene gain and loss also play key roles in adaptation. We determined the extent of horizontal gene transfer (HGT) from prokaryotes and their role in the origin of novel functions in the Picochlorum clade. HGT is an ongoing and dynamic process in this algal clade with adaptation being driven by transfer, divergence, and loss. One HGT candidate that is differentially expressed under salinity stress is indolepyruvate decarboxylase that is involved in the production of a plant auxin that mediates bacteria-diatom symbiotic interactions. Large differences in levels of heterozygosity were found in diploid haplotypes among Picochlorum isolates. Biallelic divergence was pronounced in P. oklahomensis (salt plains environment) when compared with its closely related sister taxon Picochlorum SENEW3 (brackish water environment), suggesting a role of diverged alleles in response to environmental stress. Our results elucidate how microbial eukaryotes with limited gene inventories expand habitat range from mesophilic to halophilic through allelic diversity, and with minor but important contributions made by HGT. We also explore how the nature and quality of genome data may impact inference of nuclear ploidy.


September 22, 2019  |  

The impact of genome evolution on the allotetraploid Nicotiana rustica – an intriguing story of enhanced alkaloid production.

Nicotiana rustica (Aztec tobacco), like common tobacco (Nicotiana tabacum), is an allotetraploid formed through a recent hybridization event; however, it originated from completely different progenitor species. Here, we report the comparative genome analysis of wild type N. rustica (5 Gb; 2n = 4x = 48) with its three putative diploid progenitors (2.3-3 Gb; 2n = 2x = 24), Nicotiana undulata, Nicotiana paniculata and Nicotiana knightiana. In total, 41% of the N. rustica genome originated from the paternal donor (N. undulata), while 59% originated from the maternal donor (N. paniculata/N. knightiana). Chloroplast genome and gene analyses indicated that N. knightiana is more closely related to N. rustica than N. paniculata. Gene clustering revealed 14,623 ortholog groups common to other Nicotiana species and 207 unique to N. rustica. Genome sequence analysis indicated that N. knightiana is more closely related to N. rustica than N. paniculata, and that the higher nicotine content of N. rustica leaves is the result of the progenitor genomes combination and of a more active transport of nicotine to the shoot. The availability of four new Nicotiana genome sequences provides insights into how speciation impacts plant metabolism, and in particular alkaloid transport and accumulation, and will contribute to better understanding the evolution of Nicotiana species.

