April 21, 2020

Genomic erosion and extensive horizontal gene transfer in gut-associated Acetobacteraceae.

Symbiotic relationships between animals and bacteria have profound impacts on the evolutionary trajectories of each partner. Animals and gut bacteria engage in a variety of relationships, occasionally persisting over evolutionary timescales. Ants are a diverse group of animals that engage in many types of associations with taxonomically distinct groups of bacterial associates. Here, we bring into culture and characterize two closely related strains of gut-associated Acetobacteraceae (AAB) of the red carpenter ant, Camponotus chromaiodes. Genome sequencing, assembly, and annotation of both strains delineate stark patterns of genomic erosion and sequence divergence in gut-associated AAB. We found widespread horizontal gene transfer (HGT) in these bacterial associates and report elevated gene acquisition associated with energy production and conversion, amino acid and coenzyme transport and metabolism, defense mechanisms, and lysine export. Both strains have acquired the complete NADH-quinone oxidoreductase complex, plausibly from an Enterobacteriaceae origin, likely facilitating energy production under diverse conditions. Conservation of several lysine biosynthetic and salvage pathways and accumulation of lysine export genes via HGT implicate L-lysine supplementation by both strains as a potential functional benefit for the host. These trends are contrasted by genome-wide erosion of several amino acid biosynthetic pathways and pathways in central metabolism. We perform phylogenomic analyses on both strains as well as several free-living and host-associated AAB. Based on their monophyly and deep divergence from other AAB, these C. chromaiodes gut associates may represent a novel genus. Together, our results demonstrate how extensive horizontal transfer between gut associates along with genome-wide deletions leads to mosaic metabolic pathways. More broadly, these patterns demonstrate that HGT and genomic erosion shape metabolic capabilities of persistent gut associates and influence their genomic evolution. Using comparative genomics, our study reveals substantial changes in genomic content in persistent associates of the insect gastrointestinal tract and provides evidence for the evolutionary pressures inherent to this environment. We describe patterns of genomic erosion and horizontal acquisition that result in mosaic metabolic pathways. Accordingly, both strains of these associates form a divergent, monophyletic clade that is sister to gut associates of honey bees and more distantly related to Gluconobacter.


April 21, 2020

Chromosome conformation capture resolved near complete genome assembly of broomcorn millet.

Broomcorn millet (Panicum miliaceum L.) has strong tolerance to abiotic stresses and is probably one of the oldest crops, with its earliest cultivation dating back ~10,000 years. We report here its genome assembly through a combination of PacBio sequencing, BioNano, and Hi-C (in vivo) mapping. The 18 super scaffolds cover ~95.6% of the estimated genome (~887.8 Mb). There are 63,671 protein-coding genes annotated in this tetraploid genome. About 86.2% of the syntenic genes in foxtail millet have two homologous copies in broomcorn millet, indicating rare gene loss after tetraploidization in broomcorn millet. Phylogenetic analysis reveals that broomcorn millet and foxtail millet diverged around 13.1 million years ago (Mya), while the lineage-specific tetraploidization of broomcorn millet may have happened within ~5.91 million years. The genome is not only beneficial for the genome-assisted breeding of broomcorn millet, but also an important resource for other Panicum species.


April 21, 2020

Genome-wide profiling of the alternative splicing provides insights into development in Plutella xylostella.

The diamondback moth (DBM), Plutella xylostella (L.), is a major pest of cruciferous crops worldwide. While the species has become a model for genomics, post-transcriptional mechanisms associated with development and sex determination have not been comprehensively studied, and the lack of complete mRNA transcript structures limits further research. Here, we combined single-molecule long-read sequencing (IsoSeq) and RNA-seq to re-annotate the published DBM genome and present the genome-wide identification of alternative splicing (AS) associated with development and sex determination of DBM. In total, we identified ~13,900 genes (~77%) annotated in the DBM genome (version-2), resulting in the correction of 1586 wrongly annotated genes and identification of 78,000 previously unannotated transcripts. We also identified 1804 genes showing AS in each of the developmental stages and sexes, suggesting that AS events are ubiquitous in DBM. Comparative analyses showed that these AS events were rarely shared among developmental stages, indicating that they may play key specific roles in regulation of insect development. Further, we found 156 genes showing different AS events and expression patterns between males and females, linking them to potential functions in sex determination. Overall, the P. xylostella transcriptome provides significant information about regulatory alternative splicing events, which are shown to be involved in development and sex determination. Our work presents a solid foundation to better understand the mechanism of post-transcriptional regulation, and offers wider insights into insect development and sex determination.


April 21, 2020

A survey of transcriptome complexity using PacBio single-molecule real-time analysis combined with Illumina RNA sequencing for a better understanding of ricinoleic acid biosynthesis in Ricinus communis.

Ricinus communis is a highly economically valuable oil crop plant from the spurge family, Euphorbiaceae. However, the available reference genomes are incomplete and to date studies on ricinoleic acid biosynthesis at the transcriptional level are limited. In this study, we combined PacBio single-molecule long-read isoform sequencing and Illumina RNA sequencing to identify the alternative splicing (AS) events, novel isoforms, fusion genes, long non-coding RNAs (lncRNAs) and alternative polyadenylation (APA) sites to unveil the transcriptomic complexity of castor beans and identify critical genes related to ricinoleic acid biosynthesis. Here, we identified 11,285 AS variants distributed in 21,448 novel genes and detected 520 fusion genes, 320 lncRNAs and 9511 APA sites. Furthermore, a total of 6067, 5983 and 4058 differentially expressed genes (DEGs) between developing beans of the R. communis lines 349 and 1115, which differ extremely in oil content, were identified at 7, 14 and 21 days after flowering, respectively. Specifically, 14, 18 and 11 DEGs were annotated as encoding key enzymes related to ricinoleic acid biosynthesis, reflecting the higher castor oil content of 1115 compared with 349. Quantitative real-time RT-PCR further validated fifteen of these DEGs at three time points. Our results significantly improved the existing gene models of R. communis, and a putative model of key genes was built to show the differences between strains 349 and 1115, illustrating the molecular mechanism of castor oil biosynthesis. A multi-transcriptome database and candidate genes were provided to further improve the level of ricinoleic acid in transgenic crops.


April 21, 2020

Complete genome sequence of the Sulfodiicoccus acidiphilus strain HS-1T, the first crenarchaeon that lacks polB3, isolated from an acidic hot spring in Ohwaku-dani, Hakone, Japan.

Sulfodiicoccus acidiphilus HS-1T is the type species of the genus Sulfodiicoccus, a thermoacidophilic archaeon belonging to the order Sulfolobales (class Thermoprotei; phylum Crenarchaeota). While S. acidiphilus HS-1T shares many common physiological and phenotypic features with other Sulfolobales species, the similarities in their 16S rRNA gene sequences are less than 89%. To understand the genomic features of S. acidiphilus HS-1T within the order Sulfolobales, we determined and characterized the genome of this strain. The circular genome of S. acidiphilus HS-1T comprises 2,353,189 bp with a G+C content of 51.15 mol%. A total of 2459 genes were predicted, including 2411 protein-coding and 48 RNA genes. The features that distinguish the S. acidiphilus HS-1T genome from those of other Sulfolobales species are the absence of genes for polB3 and the autotrophic carbon fixation pathway, and the distribution pattern of essential genes and sequences related to genomic replication initiation. These insights contribute to an understanding of archaeal genomic diversity and evolution.
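
For readers unfamiliar with the G+C figure quoted above, the following is a minimal Python sketch of how such a percentage is typically computed from a sequence string. The function name `gc_content` and the toy sequence are illustrative assumptions, not part of the study; ambiguous symbols such as N are simply excluded from the denominator here, which may differ from the authors' convention.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a nucleotide sequence.

    Only unambiguous A, C, G and T bases are counted (an assumption made
    for this sketch); other symbols such as N are ignored.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


# Toy sequence for illustration only (not from the S. acidiphilus genome).
toy = "ATGCGCNNAT"
print(f"{gc_content(toy):.2f}")  # 4 G/C out of 8 counted bases -> 50.00
```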


April 21, 2020

The genome of broomcorn millet.

Broomcorn millet (Panicum miliaceum L.) is the most water-efficient cereal and one of the earliest domesticated plants. Here we report its high-quality, chromosome-scale genome assembly using a combination of short-read sequencing, single-molecule real-time sequencing, Hi-C, and a high-density genetic map. Phylogenetic analyses reveal two sets of homologous chromosomes that may have merged ~5.6 million years ago, both of which exhibit strong synteny with other grass species. Broomcorn millet contains 55,930 protein-coding genes and 339 microRNA genes. We find Paniceae-specific expansion in several subfamilies of the BTB (broad complex/tramtrack/bric-a-brac) subunit of ubiquitin E3 ligases, suggesting enhanced regulation of protein dynamics may have contributed to the evolution of broomcorn millet. In addition, we identify the coexistence of all three C4 subtypes of carbon fixation candidate genes. The genome sequence is a valuable resource for breeders and will provide the foundation for studying the exceptional stress tolerance as well as C4 biology.


April 21, 2020

Whole Genome Analysis of Lactobacillus plantarum Strains Isolated From Kimchi and Determination of Probiotic Properties to Treat Mucosal Infections by Candida albicans and Gardnerella vaginalis.

Three Lactobacillus plantarum strains ATG-K2, ATG-K6, and ATG-K8 were isolated from Kimchi, a Korean traditional fermented food, and their probiotic potentials were examined. All three strains were free of antibiotic resistance, hemolysis, and biogenic amine production and therefore assumed to be safe, as supported by whole genome analyses. These strains demonstrated several basic probiotic functions including a wide range of antibacterial activity, bile salt hydrolase activity, hydrogen peroxide production, and heat resistance at 70°C for 60 s. Further studies of antimicrobial activities against Candida albicans and Gardnerella vaginalis revealed growth inhibitory effects from culture supernatants, coaggregation effects, and killing effects of the three probiotic strains, with better efficacy toward C. albicans. In vitro treatment of the RAW264.7 murine macrophage cell line with bacterial lysates of the probiotic strains resulted in innate immunity enhancement via IL-6 and TNF-α production without lipopolysaccharide (LPS) treatment and anti-inflammatory effects via significantly increased production of IL-10 when co-treated with LPS. However, the degree of probiotic effect was different for each strain, as the highest TNF-α and the lowest IL-10 production by the RAW264.7 cells were observed in the K8 lysate treated group compared to the K2 and K6 lysate treated groups, which may be related to genomic differences such as chromosome size (K2: 3,034,884 bp, K6: 3,205,672 bp, K8: 3,221,272 bp), plasmid numbers (K2: 3, K6 and K8: 1), or total gene numbers (K2: 3,114, K6: 3,178, K8: 3,186). Although more correlative inspections to connect genomic information and biological functions are needed, genomic analyses of the three strains revealed distinct genomic compositions of each strain. Also, this finding suggests genome level analysis may be required to accurately identify microorganisms. Nevertheless, L. plantarum ATG-K2, ATG-K6, and ATG-K8 demonstrated their potential as probiotics for mucosal health improvement in both microbial and immunological contexts.


April 21, 2020

Pandoravirus Celtis Illustrates the Microevolution Processes at Work in the Giant Pandoraviridae Genomes.

With genomes of up to 2.7 Mb propagated in µm-long oblong particles and initially predicted to encode more than 2000 proteins, members of the Pandoraviridae family display the most extreme features of the known viral world. The mere existence of such giant viruses raises fundamental questions about their origin and the processes governing their evolution. A previous analysis of six newly available isolates, independently confirmed by a study including three others, established that the Pandoraviridae pan-genome is open, meaning that each new strain exhibits protein-coding genes not previously identified in other family members. With an average increment of about 60 proteins, the gene repertoire shows no sign of reaching a limit and remains largely coding for proteins without recognizable homologs in other viruses or cells (ORFans). To explain these results, we proposed that most new protein-coding genes were created de novo, from pre-existing non-coding regions of the G+C rich pandoravirus genomes. The comparison of the gene content of a new isolate, pandoravirus celtis, closely related (96% identical genome) to the previously described p. quercus is now used to test this hypothesis by studying genomic changes in a microevolution range. Our results confirm that the differences between these two similar gene contents mostly consist of protein-coding genes without known homologs, with statistical signatures close to that of intergenic regions. These newborn proteins are under slight negative selection, perhaps to maintain stable folds and prevent protein aggregation pending the eventual emergence of fitness-increasing functions. Our study also unraveled several insertion events mediated by a transposase of the hAT family, 3 copies of which are found in p. celtis and are presumably active. Members of the Pandoraviridae are presently the first viruses known to encode this type of transposase.


April 21, 2020

Long-Read Sequencing Emerging in Medical Genetics

The wide implementation of next-generation sequencing (NGS) technologies has revolutionized the field of medical genetics. However, the short read lengths of currently used sequencing approaches pose a limitation for identification of structural variants, sequencing repetitive regions, phasing alleles and distinguishing highly homologous genomic regions. These limitations may significantly contribute to the diagnostic gap in patients with genetic disorders who have undergone standard NGS, like whole exome or even genome sequencing. Now, the emerging long-read sequencing (LRS) technologies may offer improvements in the characterization of genetic variation and regions that are difficult to assess with the currently prevailing NGS approaches. LRS has so far mainly been used to investigate genetic disorders with previously known or strongly suspected disease loci. While these targeted approaches already show the potential of LRS, it remains to be seen whether LRS technologies can soon enable true whole genome sequencing routinely. Ultimately, this could allow the de novo assembly of individual whole genomes used as a generic test for genetic disorders. In this article, we summarize the current LRS-based research on human genetic disorders and discuss the potential of these technologies to facilitate the next major advancements in medical genetics.


April 21, 2020

GAPPadder: a sensitive approach for closing gaps on draft genomes with short sequence reads.

Closing gaps in draft genomes is an important post processing step in genome assembly. It leads to more complete genomes, which benefits downstream genome analysis such as annotation and genotyping. Several tools have been developed for gap closing. However, these tools don’t fully utilize the information contained in the sequence data. For example, while it is known that many gaps are caused by genomic repeats, existing tools often ignore many sequence reads that originate from a repeat-related gap. We compare GAPPadder with GapCloser, GapFiller and Sealer on one bacterial genome, human chromosome 14 and the human whole genome with paired-end and mate-paired reads with both short and long insert sizes. Empirical results show that GAPPadder can close more gaps than these existing tools. Besides closing gaps on draft genomes assembled only from short sequence reads, GAPPadder can also be used to close gaps for draft genomes assembled with long reads. We show GAPPadder can close gaps on the bed bug genome and the Asian sea bass genome that are assembled partially and fully with long reads respectively. We also show GAPPadder is efficient in both time and memory usage. In this paper, we propose a new approach called GAPPadder for gap closing. The main advantage of GAPPadder is that it uses more information in sequence data for gap closing. In particular, GAPPadder finds and uses reads that originate from repeat-related gaps. We show that these repeat-associated reads are useful for gap closing, even though they are ignored by all existing tools. Other main features of GAPPadder include utilizing the information in sequence reads with different insert sizes and performing two-stage local assembly of gap sequences. The results show that our method can close more gaps than several existing tools. The software tool, GAPPadder, is available for download at https://github.com/Reedwarbler/GAPPadder .
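
To make the notion of a "gap" concrete: in draft assemblies, gaps are conventionally written as runs of N characters inside a scaffold sequence. The sketch below locates such runs in plain Python. It is not GAPPadder's implementation or interface; the function name `find_gaps`, the `min_len` parameter, and the toy scaffold are all hypothetical and serve only to illustrate the input that any gap-closing tool starts from.

```python
import re
from typing import List, Tuple


def find_gaps(scaffold: str, min_len: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) coordinates of N-runs in a scaffold.

    Runs shorter than min_len are skipped. Coordinates are 0-based.
    """
    return [
        (m.start(), m.end())
        for m in re.finditer(r"[Nn]+", scaffold)
        if m.end() - m.start() >= min_len
    ]


# Toy scaffold for illustration only.
toy_scaffold = "ACGTNNNNACGTNNAC"
print(find_gaps(toy_scaffold))             # [(4, 8), (12, 14)]
print(find_gaps(toy_scaffold, min_len=3))  # [(4, 8)]
```

A gap closer would then try to replace each reported interval with real sequence assembled from reads that map near or into the gap.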


April 21, 2020

A new reference genome for Sorghum bicolor reveals high levels of sequence similarity between sweet and grain genotypes: implications for the genetics of sugar metabolism.

The process of crop domestication often consists of two stages: initial domestication, where the wild species is first cultivated by humans, followed by diversification, when the domesticated species are subsequently adapted to more environments and specialized uses. Selective pressure to increase sugar accumulation in certain varieties of the cereal crop Sorghum bicolor is an excellent example of the latter; this has resulted in pronounced phenotypic divergence between sweet and grain-type sorghums, but the genetic mechanisms underlying these differences remain poorly understood. Here we present a new reference genome based on an archetypal sweet sorghum line and compare it to the current grain sorghum reference, revealing a high rate of nonsynonymous and potential loss of function mutations, but few changes in gene content or overall genome structure. We also use comparative transcriptomics to highlight changes in gene expression correlated with high stalk sugar content and show that changes in the activity and possibly localization of transporters, along with the timing of sugar metabolism play a critical role in the sweet phenotype. The high level of genomic similarity between sweet and grain sorghum reflects their historical relatedness, rather than their current phenotypic differences, but we find key changes in signaling molecules and transcriptional regulators that represent new candidates for understanding and improving sugar metabolism in this important crop.


April 21, 2020

Magic-BLAST, an accurate RNA-seq aligner for long and short reads.

Next-generation sequencing technologies can produce tens of millions of reads, often paired-end, from transcripts or genomes. But few programs can align RNA on the genome and accurately discover introns, especially with long reads. We introduce Magic-BLAST, a new aligner based on ideas from the Magic pipeline. Magic-BLAST uses innovative techniques that include the optimization of a spliced alignment score and selective masking during seed selection. We evaluate the performance of Magic-BLAST to accurately map short or long sequences and its ability to discover introns on real RNA-seq data sets from PacBio, Roche and Illumina runs, and on six benchmarks, and compare it to other popular aligners. Additionally, we look at alignments of human idealized RefSeq mRNA sequences perfectly matching the genome. We show that Magic-BLAST is the best at intron discovery over a wide range of conditions and the best at mapping reads longer than 250 bases, from any platform. It is versatile and robust to high levels of mismatches or extreme base composition, and reasonably fast. It can align reads to a BLAST database or a FASTA file. It can accept a FASTQ file as input or automatically retrieve an accession from the SRA repository at the NCBI.


April 21, 2020

Identification of Diverse Integron and Plasmid Structures Carrying a Novel Carbapenemase Among Pseudomonas Species.

A novel carbapenem-hydrolyzing beta-lactamase, called IMP-63, was identified in three clonally distinct strains of Pseudomonas aeruginosa and two strains of Pseudomonas putida isolated within a 4 year timeframe in three French hospitals. The blaIMP-63 gene that encodes this carbapenemase turned out to be located in the variable region of four integrons (In1297, In1574, In1573, and In1572) and to coexist with novel or rare gene cassettes (fosM, gcu170, gcuF1) and insertion elements (ISPsp7v, ISPa16v). All these integrons except one (In1574) were flanked by a copy of insertion sequence ISPa17 next to the orf6 putative gene, and were carried by non-conjugative plasmids (pNECK1, pROUSS1, pROUSS2, pROUE1). These plasmids exhibit unique modular structures and partial sequence homologies with plasmids previously identified in various non-fermenting environmental Gram-negative species. Lines of evidence suggest that ISPa17 promoted en bloc the transposition of IMP-63-encoding integrons on these different plasmids. As demonstrated by genotyping experiments, isolates of P. aeruginosa harboring the 28.9-kb plasmid pNECK1 and belonging to international “high-risk” clone ST308 were responsible for an outbreak in one hospital. Collectively, these data provide an insight into the complex and unpredictable routes of diffusion of some resistance determinants, here blaIMP-63, among Pseudomonas species.


April 21, 2020

Exploring the landscape of focal amplifications in cancer using AmpliconArchitect.

Focal oncogene amplification and rearrangements drive tumor growth and evolution in multiple cancer types. We present AmpliconArchitect (AA), a tool to reconstruct the fine structure of focally amplified regions using whole genome sequencing (WGS), and validate it extensively on multiple simulated and real datasets, across a wide range of coverage and copy numbers. Analysis of AA-reconstructed amplicons in a pan-cancer dataset reveals many novel properties of copy number amplifications in cancer. These findings support a model in which focal amplifications arise due to the formation and replication of extrachromosomal DNA. Applying AA to 68 viral-mediated cancer samples, we identify a large fraction of amplicons with specific structural signatures suggestive of hybrid, human-viral extrachromosomal DNA. AA reconstruction, integrated with metaphase fluorescence in situ hybridization (FISH) and PacBio sequencing on the cell line UPCI:SCC090, confirms the extrachromosomal origin and fine structure of a Forkhead box E1 (FOXE1)-containing hybrid amplicon.


April 21, 2020

Reviving the Transcriptome Studies: An Insight into the Emergence of Single-molecule Transcriptome Sequencing

Advances in transcriptomics have provided an exceptional opportunity to study functional implications of the genetic variability. Technologies such as RNA-Seq have emerged as state-of-the-art techniques for transcriptome analysis that take advantage of high-throughput next-generation sequencing. However, similar to their predecessors, these approaches continue to impose major challenges on full-length transcript structure identification, primarily due to inherent limitations of read length. With the development of single-molecule sequencing (SMS) from PacBio, a growing number of studies on the transcriptome of different organisms have been reported. SMS has emerged as advantageous for comprehensive genome annotation including identification of novel genes/isoforms, long non-coding RNAs and fusion transcripts. This approach can be used across a broad spectrum of species to better interpret the coding information of the genome, and facilitate the biological function study. We provide an overview of SMS platform and its diverse applications in various biological studies, and our perspective on the challenges associated with the transcriptome studies.

