July 7, 2019

Insights into land plant evolution garnered from the Marchantia polymorpha genome.

The evolution of land flora transformed the terrestrial environment. Land plants evolved from an ancestral charophycean alga from which they inherited developmental, biochemical, and cell biological attributes. Additional biochemical and physiological adaptations to land, and a life cycle with an alternation between multicellular haploid and diploid generations that facilitated efficient dispersal of desiccation tolerant spores, evolved in the ancestral land plant. We analyzed the genome of the liverwort Marchantia polymorpha, a member of a basal land plant lineage. Relative to charophycean algae, land plant genomes are characterized by genes encoding novel biochemical pathways, new phytohormone signaling pathways (notably auxin), expanded repertoires of signaling pathways, and increased diversity in some transcription factor families. Compared with other sequenced land plants, M. polymorpha exhibits low genetic redundancy in most regulatory pathways, with this portion of its genome resembling that predicted for the ancestral land plant. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.


July 7, 2019

Genome sequence of the small brown planthopper, Laodelphax striatellus.

Laodelphax striatellus Fallén (Hemiptera: Delphacidae) is one of the most destructive rice pests. L. striatellus is different from 2 other rice planthoppers with a released genome sequence, Sogatella furcifera and Nilaparvata lugens, in many biological characteristics, such as host range, dispersal capacity, and vectoring plant viruses. Deciphering the genome of L. striatellus will further the understanding of the genetic basis of the biological differences among the 3 rice planthoppers. A total of 190 Gb of Illumina data and 32.4 Gb of PacBio data were generated and used to assemble a high-quality L. striatellus genome sequence, which is 541 Mb in length and has a contig N50 of 118 Kb and a scaffold N50 of 1.08 Mb. Annotated repetitive elements account for 25.7% of the genome. A total of 17,736 protein-coding genes were annotated, capturing 97.6% and 98% of the BUSCO eukaryote and arthropoda genes, respectively. Compared with N. lugens and S. furcifera, L. striatellus has the smallest genome and the lowest gene number. Gene family expansion and transcriptomic analyses provided hints to the genomic basis of the differences in important traits such as host range, migratory habit, and plant virus transmission between L. striatellus and the other 2 planthoppers. We report a high-quality genome assembly of L. striatellus, which is an important genomic resource not only for the study of the biology of L. striatellus and its interactions with plant hosts and plant viruses, but also for comparison with other planthoppers. © The Authors 2017. Published by Oxford University Press.


July 7, 2019

Identification of sRNA mediated responses to nutrient depletion in Burkholderia pseudomallei.

The Burkholderia genus includes many species that are known to survive in diverse environmental conditions, including low-nutrient environments. One species, Burkholderia pseudomallei, is a versatile pathogen that can survive in a wide range of hosts and environmental conditions. In this study, we investigated how a nutrient-depleted growth environment evokes sRNA-mediated responses by B. pseudomallei. Computationally predicted B. pseudomallei D286 sRNAs were mapped to RNA-sequencing data for cultures grown under two conditions: (1) BHIB as a nutrient-rich media reference environment and (2) M9 media as a nutrient-depleted stress environment. The sRNAs were further selected to identify potentially cis-encoded systems by investigating their possible interactions with their flanking genes. The mapping of predicted sRNA genes and the analysis of their interactions with flanking genes identified 12 sRNA candidates that may have cis-acting regulatory roles associated with a nutrient-depleted growth environment. Our approach can be used for identifying novel sRNA genes and their possible role as cis-mediated regulatory systems.


July 7, 2019

The genome sequence of Bipolaris cookei reveals mechanisms of pathogenesis underlying target leaf spot of sorghum.

Bipolaris cookei (=Bipolaris sorghicola) causes target leaf spot, one of the most prevalent foliar diseases of sorghum. Little is known about the molecular basis of pathogenesis in B. cookei, in large part due to a paucity of resources for molecular genetics, such as a reference genome. Here, a draft genome sequence of B. cookei was obtained and analyzed. A hybrid assembly strategy utilizing Illumina and Pacific Biosciences sequencing technologies produced a draft nuclear genome of 36.1 Mb, organized into 321 scaffolds with L50 of 31 and N50 of 378 kb, from which 11,189 genes were predicted. Additionally, a finished mitochondrial genome sequence of 135,790 bp was obtained, which contained 75 predicted genes. Comparative genomics revealed that B. cookei possessed substantially fewer carbohydrate-active enzymes and secreted proteins than closely related Bipolaris species. Novel genes involved in secondary metabolism, including genes implicated in ophiobolin biosynthesis, were identified. Among 37 B. cookei genes induced during sorghum infection, one encodes a putative effector with a limited taxonomic distribution among plant pathogenic fungi. The draft genome sequence of B. cookei provided novel insights into target leaf spot of sorghum and is an important resource for future investigation.
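The N50 and L50 statistics quoted in the abstract above follow the usual assembly definitions: with scaffolds sorted from longest to shortest, N50 is the length of the scaffold at which the running total first reaches half of the assembly size, and L50 is the number of scaffolds needed to get there. A minimal Python sketch is given below. The function name and the example lengths are hypothetical and chosen for illustration only; they are not the B. cookei data.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)  # longest first
    half = sum(ordered) / 2                  # half of the total assembly size
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # 'length' is the N50; 'count' scaffolds were needed, which is the L50
            return length, count
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths (arbitrary units), not taken from any real assembly
example = [100, 80, 60, 40, 20]
n50, l50 = n50_l50(example)
print(n50, l50)
```

Here the total is 300, so the halfway point of 150 is reached by the second-longest scaffold.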


July 7, 2019

Comparative transcriptome profiling of virulent and non-virulent Trypanosoma cruzi underlines the role of surface proteins during infection.

Trypanosoma cruzi, the protozoan that causes Chagas disease, has a complex life cycle involving several morphologically and biochemically distinct stages that establish intricate interactions with various insect and mammalian hosts. It also has a heterogeneous population structure comprising strains with distinct properties such as virulence, sensitivity to drugs, antigenic profile and tissue tropism. We present a comparative transcriptome analysis of two cloned T. cruzi strains that display contrasting virulence phenotypes in animal models of infection: CL Brener is a virulent clone and CL-14 is a clone that is neither infective nor pathogenic in in vivo models of infection. Gene expression analysis of trypomastigotes and intracellular amastigotes harvested at 60 and 96 hours post-infection (hpi) of human fibroblasts revealed large differences that reflect the parasite’s adaptation to distinct environments during the infection of mammalian cells, including changes in energy sources, oxidative stress responses, cell cycle control and cell surface components. While extensive transcriptome remodeling was observed when trypomastigotes of both strains were compared to 60 hpi amastigotes, differences in gene expression were much less pronounced when 96 hpi amastigotes and trypomastigotes of CL Brener were compared. In contrast, the differentiation of the avirulent CL-14 from 96 hpi amastigotes to extracellular trypomastigotes was associated with considerable changes in gene expression, particularly in gene families encoding surface proteins such as trans-sialidases, mucins and the mucin associated surface proteins (MASPs). Thus, our comparative transcriptome analysis indicates that the avirulent phenotype of CL-14 may be due, at least in part, to a reduced or delayed expression of genes encoding surface proteins that are associated with the transition of amastigotes to trypomastigotes, an essential step in the establishment of the infection in the mammalian host. Confirming the role of members of the trans-sialidase family of surface proteins for parasite differentiation, transfected CL-14 constitutively expressing a trans-sialidase gene displayed faster kinetics of trypomastigote release in the supernatant of infected cells compared to wild type CL-14.


July 7, 2019

RNA-seq and Tn-seq reveal fitness determinants of vancomycin-resistant Enterococcus faecium during growth in human serum.

The Gram-positive bacterium Enterococcus faecium is a commensal of the human gastrointestinal tract and a frequent cause of bloodstream infections in hospitalized patients. The mechanisms by which E. faecium can survive and grow in blood during an infection have not yet been characterized. Here, we identify genes that contribute to growth of E. faecium in human serum through transcriptome profiling (RNA-seq) and a high-throughput transposon mutant library sequencing approach (Tn-seq). We first sequenced the genome of E. faecium E745, a vancomycin-resistant clinical isolate, using a combination of short- and long-read sequencing, revealing a 2,765,010 nt chromosome and 6 plasmids, with sizes ranging between 9.3 kbp and 223.7 kbp. We then compared the transcriptome of E. faecium E745 during exponential growth in rich medium and in human serum by RNA-seq. This analysis revealed that 27.8% of genes on the E. faecium E745 genome were differentially expressed in these two conditions. A gene cluster with a role in purine biosynthesis was among the most upregulated genes in E. faecium E745 upon growth in serum. The E. faecium E745 transposon mutant library was then used to identify genes that were specifically required for growth of E. faecium in serum. Genes involved in de novo nucleotide biosynthesis (including pyrK_2, pyrF, purD, purH) and a gene encoding a phosphotransferase system subunit (manY_2) were thus identified to be contributing to E. faecium growth in human serum. Transposon mutants in pyrK_2, pyrF, purD, purH and manY_2 were isolated from the library and their impaired growth in human serum was confirmed. In addition, the pyrK_2 and manY_2 mutants were tested for their virulence in an intravenous zebrafish infection model and exhibited significantly attenuated virulence compared to E. faecium E745. Genes involved in carbohydrate metabolism and nucleotide biosynthesis of E. faecium are essential for growth in human serum and contribute to the pathogenesis of this organism. These genes may serve as targets for the development of novel anti-infectives for the treatment of E. faecium bloodstream infections.


July 7, 2019

An update on bioinformatics resources for plant genomics research

Next-generation sequencing and traditional Sanger sequencing methods are of great significance in unraveling the complexity of plant genomes. These are constantly generating heaps of sequence data to be analyzed, annotated and stored. This has created a revolutionary demand for bioinformatics tools and software that can perform these functions. A large number of potentially useful bioinformatics tools and plant genome databases have been created that have greatly simplified the analysis and storage of vast amounts of sequence data. The information garnered using the available bioinformatics methods has greatly helped in understanding plant genome structure. Despite the availability of a good number of such tools, the information pouring from single-gene sequencing and various whole-genome sequencing projects is overwhelming; thus, further innovations and improved methods are needed to sift through this sequence data and assemble genomes. The current review focuses on diverse bioinformatics approaches and methods developed to systematically analyze and store plant sequence data. Finally, it outlines the bottlenecks in plant genome analysis and some possible solutions that could be utilized to overcome them.


July 7, 2019

Genome sequencing brought Gossypium biology research into a new era.

The first sequenced diploid cotton genome was published in 2012 by the group led by the Institute of Cotton Research, Chinese Academy of Agricultural Sciences. Cotton genomics research subsequently entered a period of rapid development. The accumulating data have provided new insights into the evolution and domestication of cotton, the development of important agronomic traits, and strategies for improving cotton quality and production.


July 7, 2019

Bi-level error correction for PacBio long reads.

The latest sequencing technologies such as the Pacific Biosciences (PacBio) and Oxford Nanopore machines can generate long reads thousands of bases in length, much longer than the reads of a few hundred bases generated by Illumina machines. However, these long reads are prone to much higher error rates, for example 15%, making downstream analysis and applications very difficult. Error correction is a process to improve the quality of sequencing data. Hybrid correction strategies have recently been proposed that use Illumina reads with low error rates to fix sequencing errors in the noisy long reads, with good performance. In this paper, we propose a new method named Bicolor, a bi-level framework of hybrid error correction for further improving the quality of PacBio long reads. At the first level, our method uses a de Bruijn graph-based error correction idea to search paths in pairs of solid k-mers iteratively with an increasing length of k-mer. At the second level, we combine the processed results under different parameters from the first level. In particular, a multiple sequence alignment algorithm is used to align those similar long reads, followed by a voting algorithm which determines the final base at each position of the reads. We compare the performance of Bicolor with three state-of-the-art methods on three real data sets. Results demonstrate that Bicolor always achieves the highest identity ratio. Bicolor also achieves a higher alignment ratio and a higher number of aligned reads than the current methods on two data sets. On the third data set, our method is closely competitive with the current methods in terms of number of aligned reads and genome coverage. The C++ source code of our algorithm is freely available at https://github.com/yuansliu/Bicolor.
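The second-level voting step described in the abstract above can be illustrated with a column-wise majority vote over reads that have already been aligned. The Python sketch below is an illustration of that general idea only, under the assumption that the aligned reads are equal-length strings using "-" for gaps. It is not the Bicolor implementation (which is written in C++), and the function name and example reads are hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_reads, gap="-"):
    """Pick the most common symbol in each alignment column; drop gap winners."""
    if not aligned_reads:
        return ""
    width = len(aligned_reads[0])
    if any(len(read) != width for read in aligned_reads):
        raise ValueError("aligned reads must have equal length")
    consensus = []
    for column in zip(*aligned_reads):
        # most_common(1) returns the symbol with the highest count in this column
        symbol, _ = Counter(column).most_common(1)[0]
        if symbol != gap:
            consensus.append(symbol)
    return "".join(consensus)


# Hypothetical aligned copies of one read, e.g. corrected under different parameters
reads = ["ACGT-A", "ACCT-A", "ACGTTA"]
print(majority_consensus(reads))
```

In this example the third column is resolved to G by two votes to one, and the fifth column is dropped because the gap symbol wins the vote. How ties are broken is left to `Counter` here; a real tool would need an explicit rule.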


July 7, 2019

Unlocking the biological potential of Euglena gracilis: evolution, cell biology and significance to parasitism

Photosynthetic euglenids are major components of aquatic ecosystems and relatives of trypanosomes. Euglena gracilis has considerable biotechnological potential and great adaptability, but exploitation remains hampered by the absence of a comprehensive gene catalogue. We address this by genome, RNA and protein sequencing: the E. gracilis genome is >2 Gb, with 36,526 predicted proteins. Large lineage-specific paralog families are present, with evidence for flexibility in environmental monitoring, divergent mechanisms for metabolic control, and novel solutions for adaptation to extreme environments. Contributions from photosynthetic eukaryotes to the nuclear genome, consistent with the shopping bag model, are found, together with transitions between kinetoplastid and canonical systems. Control of protein expression is almost exclusively post-transcriptional. These data are a major advance in understanding the nuclear genomes of euglenids and provide a platform for investigating the contributions of E. gracilis and its relatives to the biosphere.


July 7, 2019

The plastid genome in Cladophorales green algae is encoded by hairpin chromosomes.

Virtually all plastid (chloroplast) genomes are circular double-stranded DNA molecules, typically between 100 and 200 kb in size and encoding circa 80-250 genes. Exceptions to this universal plastid genome architecture are very few and include the dinoflagellates, where genes are located on DNA minicircles. Here we report on the highly deviant chloroplast genome of Cladophorales green algae, which is entirely fragmented into hairpin chromosomes. Short- and long-read high-throughput sequencing of DNA and RNA demonstrated that the chloroplast genes of Boodlea composita are encoded on 1- to 7-kb DNA contigs with an exceptionally high GC content, each containing a long inverted repeat with one or two protein-coding genes and conserved non-coding regions putatively involved in replication and/or expression. We propose that these contigs correspond to linear single-stranded DNA molecules that fold onto themselves to form hairpin chromosomes. The Boodlea chloroplast genes are highly divergent from their corresponding orthologs, and display an alternative genetic code. The origin of this highly deviant chloroplast genome most likely occurred before the emergence of the Cladophorales, and coincided with an elevated transfer of chloroplast genes to the nucleus. A chloroplast genome that is composed only of linear DNA molecules is unprecedented among eukaryotes, and highlights unexpected variation in plastid genome architecture. Copyright © 2017 Elsevier Ltd. All rights reserved.


July 7, 2019

Mechanisms of adaptive divergence and speciation in Littorina saxatilis: Integrating knowledge from ecology and genetics with new data emerging from genomic studies

New opportunities to understand marine speciation and evolution of local adaptation come with genomic approaches and with the development of comprehensive model systems. The marine snail Littorina saxatilis is one example of a developing marine model for investigating genetic mechanisms of rapid divergence and evolution in natural systems. This species is strongly polymorphic and shows formation of local ecotypes throughout its distribution. Support is strong for primary (in situ) and parallel formation of reproductively semi-isolated ecotypes with contact zones between heterogeneous intertidal microhabitats. This makes this species an ideal organism for gaining new insights into the interplay of divergent selection, gene flow and genetic drift during local adaptation and speciation. A relatively well-resolved draft genome and a genetic map describing 17 linkage groups (“chromosomes”) are key tools for investigating the role of structural genomic variation, such as inversions, gene duplications and translocations. Whole genome re-sequencing of pools of individuals and the first comprehensive study of a contact zone contribute direct information on selection and barriers to gene flow present in specific regions of the genome. Linking selection at the phenotypic level to patterns observed in the genome is under way by quantitative trait loci mapping and annotation of candidate genes, while the role of single mutations on individual fitness will have to await development of gene manipulation tools. The features of the snail system facilitate the study of local adaptation and speciation and its genomic basis, but the underlying evolutionary processes are expected to be similar in other organisms, and hence this species is a useful model.


July 7, 2019

Genome sequence-based marker development and genotyping in potato

Potato (Solanum tuberosum L.) is one of the world’s most economically important food crops and holds major significance for future food security. Despite its importance, the study of potato genetics and breeding has lagged behind mainly due to its polyploid genome and high levels of heterozygosity. Conventional marker and genotyping approaches have been helpful in progressing potato genetic research but have also had limitations in exploiting the outcome from these studies for gene discovery and applied research applications. The sequencing of the potato genome, followed by advancements in marker and genotyping technologies, has brought a step change in the way potato genetic studies are conducted. Potato is now amenable to modern sequence-based marker and genotyping methods with their increased ability to put thousands of markers on any population of interest without a priori knowledge. This has increased the precision and resolution of genetic studies previously not feasible in potato. A diverse range of fixed and flexible genotyping platforms, for a wide variety of research and breeding applications, are now available. Concerted research efforts are now needed to screen the available genetic diversity for this important crop to identify novel and beneficial trait alleles in order to enable efficient and precise introgression breeding permitting breeding of climate smart, and resilient, potato cultivars. This chapter provides an overview of sequence-based marker development and genotyping methods along with their implications for potato research and breeding in the post-genomics era.


July 7, 2019

Complete genome sequence of Lactobacillus plantarum JBE245 isolated from Meju

Lactobacillus plantarum is widely found in fermented foods and has various phenotypic and genetic characteristics to adapt to the environment. Here we report the complete annotated genome sequence of the L. plantarum strain JBE245 (= KCCM43243) isolated for malolactic fermentation of apple juice. The genome comprises a single circular 3,262,611 bp chromosome with 2907 coding regions, 45 pseudogenes, and 91 RNA genes. The genome contains 4 malate dehydrogenase genes, 3 malate permease genes and various types of plantaricin-synthesizing genes. These genetic traits meet the selection criteria of the strains that should prevent the spoilage of apple juice during fermentation and efficiently convert malate to lactic acid.


July 7, 2019

Integrating transcriptomic and proteomic data for accurate assembly and annotation of genomes.

Complementing genome sequence with deep transcriptome and proteome data could enable more accurate assembly and annotation of newly sequenced genomes. Here, we provide a proof-of-concept of an integrated approach for analysis of the genome and proteome of Anopheles stephensi, which is one of the most important vectors of the malaria parasite. To achieve broad coverage of genes, we carried out transcriptome sequencing and deep proteome profiling of multiple anatomically distinct sites. Based on transcriptomic data alone, we identified and corrected 535 events of incomplete genome assembly involving 1196 scaffolds and 868 protein-coding gene models. This proteogenomic approach enabled us to add 365 genes that were missed during genome annotation and identify 917 gene correction events through discovery of 151 novel exons, 297 protein extensions, 231 exon extensions, 192 novel protein start sites, 19 novel translational frames, 28 events of joining of exons, and 76 events of joining of adjacent genes as a single gene. Incorporation of proteomic evidence allowed us to change the designation of more than 87 predicted “noncoding RNAs” to conventional mRNAs coded by protein-coding genes. Importantly, extension of the newly corrected genome assemblies and gene models to 15 other newly assembled Anopheline genomes led to the discovery of a large number of apparent discrepancies in assembly and annotation of these genomes. Our data provide a framework for how future genome sequencing efforts should incorporate transcriptomic and proteomic analysis in combination with simultaneous manual curation to achieve near complete assembly and accurate annotation of genomes. © 2017 Prasad et al.; Published by Cold Spring Harbor Laboratory Press.

