Menu
September 22, 2019  |  

Single molecule, full-length transcript sequencing provides insight into the extreme metabolism of ruby-throated hummingbird Archilochus colubris

Hummingbirds oxidize ingested nectar sugars directly to fuel foraging but cannot sustain this fuel use during fasting periods, such as during the night or during long-distance migratory flights. Instead, fasting hummingbirds switch to oxidizing stored lipids, derived from ingested sugars. The hummingbird liver plays a key role in moderating energy homeostasis and this remarkable capacity for fuel switching. Additionally, liver is the principle location of de novo lipogenesis, which can occur at exceptionally high rates, such as during premigratory fattening. Yet understanding how this tissue and whole organism moderates energy turnover is hampered by a lack of information regarding how relevant enzymes differ in sequence, expression, and regulation. We generated a de novo transcriptome of the hummingbird liver using PacBio full-length cDNA sequencing (Iso-Seq), yielding a total of 8.6Gb of sequencing data, or 2.6M reads from 4 different size fractions. We analyzed data using the SMRTAnalysis v3.1 Iso-Seq pipeline, then clustered isoforms into gene families to generate de novo gene contigs using Cogent. We performed orthology analysis to identify closely related sequences between our transcriptome and other avian and human gene sets. Finally, we closely examined homology of critical lipid metabolism genes between our transcriptome data and avian and human genomes. We confirmed high levels of sequence divergence within hummingbird lipogenic enzymes, suggesting a high probability of adaptive divergent function in the hepatic lipogenic pathways. Our results leverage cutting-edge technology and a novel bioinformatics pipeline to provide a first direct look at the transcriptome of this incredible organism.


September 22, 2019  |  

High resolution annotation of zebrafish transcriptome using long-read sequencing.

With the emergence of zebrafish as an important model organism, a concerted effort has been made to study its transcriptome. This effort is limited, however, by gaps in zebrafish annotation, which are especially pronounced concerning transcripts dynamically expressed during zygotic genome activation (ZGA). To date, short-read sequencing has been the principal technology for zebrafish transcriptome annotation. In part because these sequence reads are too short for assembly methods to resolve the full complexity of the transcriptome, the current annotation is rudimentary. By providing direct observation of full-length transcripts, recently refined long-read sequencing platforms can dramatically improve annotation coverage and accuracy. Here, we leveraged the SMRT platform to study the transcriptome of zebrafish embryos before and after ZGA. Our analysis revealed additional novelty and complexity in thehttps://www.ncbi.nlm.nih.gov/pubmed/nfidence novel transcripts that originated from previously unannotated loci and 1835 high-confidence new isoforms in previously annotated genes. We validated these findings using a suite of computational approaches including structural prediction, sequence homology, and functional conservation analyses, as well as by confirmatory transcript quantification with short-read sequencing data. Our analyses provided insight into new homologs and paralogs of functionally important proteins and noncoding RNAs, isoform switching occurrences, and different classes of novel splicing events. Several novel isoforms representing distinct splicing events were validated through PCR experiments, including the discovery and validation of a novel 8-kb transcript spanning multiple mir-430 elements, an important driver of early development. Our study provides a significantly improved zebrafish transcriptome annotation resource.© 2018 Nudelman et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019  |  

Improved high-quality genome assembly and annotation of Tibetan hulless barley

Background The Tibetan hulless barley (Hordeum vulgare L. var. nudum), also called textquotedblleftQingketextquotedblright in Chinese and textquotedblleftNetextquotedblright in Tibetan, is the staple food for Tibetans and an important livestock feed in the Tibetan Plateau. The Tibetan hulless barley in China has about 3500 years of cultivation history, mainly produced in Tibet, Qinghai, Sichuan, Yunnan and other areas. In addition, Tibetan hulless barley has rich nutritional value and outstanding health effects, including the beta glucan, dietary fiber, amylopectin, the contents of trace elements, which are higher than any other cereal crops.Findings Here, we reported an improved high-quality assembly of Tibetan hulless barley genome with 4.0 Gb in size. We employed the falcon assembly package, scaffolding and error correction tools to finish improvement using PacBio long reads sequencing technology, with contig and scaffold N50 lengths of 1.563Mb and 4.006Mb, respectively, representing more continuous than the original Tibetan hulless barley genome nearly two orders of magnitude. We also re-annotated the new assembly, and reported 61,303 stringent confident putative protein-coding genes, of which 40,457 is HC genes. We have developed a new Tibetan hulless barley genome database (THBGD) to download and use friendly, as well as to better manage the information of the Tibetan hulless barley genetic resources.Conclusions The availability of new Tibetan hulless barley genome and annotations will take the genetics of Tibetan hulless barley to a new level and will greatly simplify the breeders effort. It will also enrich the granary of the Tibetan people.AbbreviationsBLASTBasic Local Alignment Search ToolBUSCOBenchmarking Universal Single-Copy OrthologsQVquality valuePacBioPacifc BiosciencesRNA-seqRNA sequencingNGSNext generation sequencingTGSThird generation sequencingTHBGDTibetan hulless barley Genome Database


September 22, 2019  |  

A survey of the sorghum transcriptome using single-molecule long reads.

Alternative splicing and alternative polyadenylation (APA) of pre-mRNAs greatly contribute to transcriptome diversity, coding capacity of a genome and gene regulatory mechanisms in eukaryotes. Second-generation sequencing technologies have been extensively used to analyse transcriptomes. However, a major limitation of short-read data is that it is difficult to accurately predict full-length splice isoforms. Here we sequenced the sorghum transcriptome using Pacific Biosciences single-molecule real-time long-read isoform sequencing and developed a pipeline called TAPIS (Transcriptome Analysis Pipeline for Isoform Sequencing) to identify full-length splice isoforms and APA sites. Our analysis reveals transcriptome-wide full-length isoforms at an unprecedented scale with over 11,000 novel splice isoforms. Additionally, we uncover APA of ~11,000 expressed genes and more than 2,100 novel genes. These results greatly enhance sorghum gene annotations and aid in studying gene regulation in this important bioenergy crop. The TAPIS pipeline will serve as a useful tool to analyse Iso-Seq data from any organism.


September 22, 2019  |  

Long read reference genome-free reconstruction of a full-length transcriptome from Astragalus membranaceus reveals transcript variants involved in bioactive compound biosynthesis.

Astragalus membranaceus, also known as Huangqi in China, is one of the most widely used medicinal herbs in Traditional Chinese Medicine. Traditional Chinese Medicine formulations from Astragalus membranaceus have been used to treat a wide range of illnesses, such as cardiovascular disease, type 2 diabetes, nephritis and cancers. Pharmacological studies have shown that immunomodulating, anti-hyperglycemic, anti-inflammatory, antioxidant and antiviral activities exist in the extract of Astragalus membranaceus. Therefore, characterising the biosynthesis of bioactive compounds in Astragalus membranaceus, such as Astragalosides, Calycosin and Calycosin-7-O-ß-d-glucoside, is of particular importance for further genetic studies of Astragalus membranaceus. In this study, we reconstructed the Astragalus membranaceus full-length transcriptomes from leaf and root tissues using PacBio Iso-Seq long reads. We identified 27 975 and 22 343 full-length unique transcript models in each tissue respectively. Compared with previous studies that used short read sequencing, our reconstructed transcripts are longer, and are more likely to be full-length and include numerous transcript variants. Moreover, we also re-characterised and identified potential transcript variants of genes involved in Astragalosides, Calycosin and Calycosin-7-O-ß-d-glucoside biosynthesis. In conclusion, our study provides a practical pipeline to characterise the full-length transcriptome for species without a reference genome and a useful genomic resource for exploring the biosynthesis of active compounds in Astragalus membranaceus.


September 22, 2019  |  

Single molecule RNA sequencing uncovers trans-splicing and improves annotations in Anopheles stephensi.

Single molecule real-time (SMRT) sequencing has recently been used to obtain full-length cDNA sequences that improve genome annotation and reveal RNA isoforms. Here, we used one such method called isoform sequencing from Pacific Biosciences (PacBio) to sequence a cDNA library from the Asian malaria mosquito Anopheles stephensi. More than 600 000 full-length cDNAs, referred to as reads of insert, were identified. Owing to the inherently high error rate of PacBio sequencing, we tested different approaches for error correction. We found that error correction using Illumina RNA sequencing (RNA-seq) generated more data than using the default SMRT pipeline. The full-length error-corrected PacBio reads greatly improved the gene annotation of Anopheles stephensi: 4867 gene models were updated and 1785 alternatively spliced isoforms were added to the annotation. In addition, six trans-splicing events, where exons from different primary transcripts were joined together, were identified in An. stephensi. All six trans-splicing events appear to be conserved in Culicidae, as they are also found in Anopheles gambiae and Aedes aegypti. The proteins encoded by trans-splicing events are also highly conserved and the orthologues of these proteins are cis-spliced in outgroup species, indicating that trans-splicing may arise as a mechanism to rescue genes that broke up during evolution.© 2017 The Royal Entomological Society.


September 22, 2019  |  

Comparative mapping of the ASTRINGENCY locus controlling fruit astringency in hexaploid persimmon (Diospyros kaki Thunb.) with the diploid D. lotus reference genome

Persimmon (Diospyros kaki) is a tree crop species that originated in East Asia, consists mainly of hexaploid individuals (2n = 6x = 90) with some nonaploid individuals. One of the unique characteristics of persimmon is the continuous accumulation of proanthocyanidins (PAs) in its fruit until the middle of fruit development, resulting in a strong astringent taste even at commercial fruit maturity. Among persimmon cultivars, pollination-constant and non-astringent (PCNA) types cease PA accumulation in early fruit development and become non-astringent at commercial maturity. PCNA is an allelic trait to non-PCNA and is controlled by a single locus called the ASTRINGENCY (AST) locus. Previous segregation analyses indicated that the AST locus shows hexasomic inheritance; a recessive allele, ast, at this locus confers PCNA. Here, we report a shuttle mapping approach to delimit the AST locus region in the hexaploid persimmon genome by using D. lotus, a diploid relative of D. kaki, as a reference. A D. lotus F1 population of 333 individuals and 296 D. kaki siblings segregating for the PCNA trait were used to map the AST region using haplotype-specific markers covering the AST region. This indicated that the AST locus is syntenic to an approximately 915-kb region of the D. lotus genome. In this 915-kb region, we found several candidates for AST that were revealed from the fruit transcriptome of a population segregating for the PCNA trait. These results could provide important clues for the isolation of AST in hexaploid persimmon.


September 22, 2019  |  

An ancient integration in a plant NLR is maintained as a trans-species polymorphism

Plant immune receptors are under constant selective pressure to maintain resistance to plant pathogens. Nucleotide-binding leucine-rich repeat (NLR) proteins are one class of cytoplasmic immune receptors whose genes commonly show signatures of adaptive evolution. While it is known that balancing selection contributes to maintaining high intraspecific allelic diversity, the evolutionary mechanism that influences the transmission of alleles during speciation remains unclear. The barley Mla locus has over 30 described alleles conferring isolate-specific resistance to barley powdery mildew and contains three NLR families (RGH1, RGH2, and RGH3). We discovered (using sequence capture and RNAseq) the presence of a novel integrated Exo70 domain in RGH2 in the Mla3 haplotype. Allelic variation across barley accessions includes presence/absence of the integrated domain in RGH2. Expanding our search to several Poaceae species, we found shared interspecific conservation in the RGH2-Exo70 integration. We hypothesise that balancing selection has maintained allelic variation at Mla as a trans-species polymorphism over 24 My, thus contributing to and preserving interspecific allelic diversity during speciation.


September 22, 2019  |  

Genomes of 13 domesticated and wild rice relatives highlight genetic conservation, turnover and innovation across the genus Oryza.

The genus Oryza is a model system for the study of molecular evolution over time scales ranging from a few thousand to 15 million years. Using 13 reference genomes spanning the Oryza species tree, we show that despite few large-scale chromosomal rearrangements rapid species diversification is mirrored by lineage-specific emergence and turnover of many novel elements, including transposons, and potential new coding and noncoding genes. Our study resolves controversial areas of the Oryza phylogeny, showing a complex history of introgression among different chromosomes in the young ‘AA’ subclade containing the two domesticated species. This study highlights the prevalence of functionally coupled disease resistance genes and identifies many new haplotypes of potential use for future crop protection. Finally, this study marks a milestone in modern rice research with the release of a complete long-read assembly of IR 8 ‘Miracle Rice’, which relieved famine and drove the Green Revolution in Asia 50 years ago.


September 22, 2019  |  

Genome sequences of Chlorella sorokiniana UTEX 1602 and Micractinium conductrix SAG 241.80: implications to maltose excretion by a green alga.

Green algae represent a key segment of the global species capable of photoautotrophic-driven biological carbon fixation. Algae partition fixed-carbon into chemical compounds required for biomass, while diverting excess carbon into internal storage compounds such as starch and lipids or, in certain cases, into targeted extracellular compounds. Two green algae were selected to probe for critical components associated with sugar production and release in a model alga. Chlorella sorokiniana UTEX 1602 – which does not release significant quantities of sugars to the extracellular space – was selected as a control to compare with the maltose-releasing Micractinium conductrix SAG 241.80 – which was originally isolated from an endosymbiotic association with the ciliate Paramecium bursaria. Both strains were subjected to three sequencing approaches to assemble their genomes and annotate their genes. This analysis was further complemented with transcriptional studies during maltose release by M. conductrix SAG 241.80 versus conditions where sugar release is minimal. The annotation revealed that both strains contain homologs for the key components of a putative pathway leading to cytosolic maltose accumulation, while transcriptional studies found few changes in mRNA levels for the genes associated with these established intracellular sugar pathways. A further analysis of potential sugar transporters found multiple homologs for SWEETs and tonoplast sugar transporters. The analysis of transcriptional differences revealed a lesser and more measured global response for M. conductrix SAG 241.80 versus C. sorokiniana UTEX 1602 during conditions resulting in sugar release, providing a catalog of genes that might play a role in extracellular sugar transport.© 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.


September 22, 2019  |  

Sequence analysis of European maize inbred line F2 provides new insights into molecular and chromosomal characteristics of presence/absence variants.

Maize is well known for its exceptional structural diversity, including copy number variants (CNVs) and presence/absence variants (PAVs), and there is growing evidence for the role of structural variation in maize adaptation. While PAVs have been described in this important crop species, they have been only scarcely characterized at the sequence level and the extent of presence/absence variation and relative chromosomal landscape of inbred-specific regions remain to be elucidated.De novo genome sequencing of the French F2 maize inbred line revealed 10,044 novel genomic regions larger than 1 kb, making up 88 Mb of DNA, that are present in F2 but not in B73 (PAV). This set of maize PAV sequences allowed us to annotate PAV content and to analyze sequence breakpoints. Using PAV genotyping on a collection of 25 temperate lines, we also analyzed Linkage Disequilibrium in PAVs and flanking regions, and PAV frequencies within maize genetic groups.We highlight the possible role of MMEJ-type double strand break repair in maize PAV formation and discover 395 new genes with transcriptional support. Pattern of linkage disequilibrium within PAVs strikingly differs from this of flanking regions and is in accordance with the intuition that PAVs may recombine less than other genomic regions. We show that most PAVs are ancient, while some are found only in European Flint material, thus pinpointing structural features that may be at the origin of adaptive traits involved in the success of this material. Characterization of such PAVs will provide useful material for further association genetic studies in European and temperate maize.


September 22, 2019  |  

Bacterial artificial chromosome clones randomly selected for sequencing reveal genomic differences between soybean cultivars

This study pioneered the use of multiple technologies to combine the bacterial artificial chromosome (BAC) pooling strategy with high-throughput next- and third-generation sequencing technologies to analyse genomic difference. To understand the genetic background of the Chinese soybean cultivar N23601, we built a BAC library and sequenced 10 randomly selected clones followed by de novo assembly. Comparative analysis was conducted against the reference genome of Glycine max var. Williams 82 (2.0). Therefore, our result is an assessment of the reference genome. Our results revealed that 3517 single nucleotide polymorphisms (SNPs) and 662 insertion–deletions (InDels) occurred in ~1.2 Mb of the genomic region and that four of the 10 BAC clones contained 15 large structural variations (72?887?bp) compared with the reference genome. Gene annotation of the reference genome showed that Glyma.18g181000 was missing from the corresponding position of the 10 BAC clones. Additionally, there may be a problem with the assembly of some positions of the reference genome. Several gap regions in the reference genome could be supplemented by using the complete sequence of the 10 BAC clones. We believe that accurate and complete BAC sequence is a valuable resource that contributes to the completeness of the reference genome.


September 22, 2019  |  

The hardy rubber tree genome provides insights into the evolution of polyisoprene biosynthesis.

Eucommia ulmoides, also called hardy rubber tree, is an economically important tree; however, the lack of its genome sequence restricts the fundamental biological research and applied studies of this plant species. Here, we present a high-quality assembly of its ~1.2-Gb genome (scaffold N50 = 1.88 Mb) with at least 26 723 predicted genes for E. ulmoides, the first sequenced genome of the order Garryales, which was obtained using an integrated strategy combining Illumina sequencing, PacBio sequencing, and BioNano mapping. As a sister taxon to lamiids and campanulids, E. ulmoides underwent an ancient genome triplication shared by core eudicots but no further whole-genome duplication in the last ~125 million years. E. ulmoides exhibits high expression levels and/or gene number expansion for multiple genes involved in stress responses and the biosynthesis of secondary metabolites, which may account for its considerable environmental adaptability. In contrast to the rubber tree (Hevea brasiliensis), which produces cis-polyisoprene, E. ulmoides has evolved to synthesize long-chain trans-polyisoprene via farnesyl diphosphate synthases (FPSs). Moreover, FPS and rubber elongation factor/small rubber particle protein gene families were expanded independently from the H. brasiliensis lineage. These results provide new insights into the biology of E. ulmoides and the origin of polyisoprene biosynthesis. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.


September 22, 2019  |  

Dynamic evolution of a-gliadin prolamin gene family in homeologous genomes of hexaploid wheat.

Wheat Gli-2 loci encode complex groups of a-gliadin prolamins that are important for breadmaking, but also major triggers of celiac disease (CD). Elucidation of a-gliadin evolution provides knowledge to produce wheat with better end-use properties and reduced immunogenic potential. The Gli-2 loci contain a large number of tandemly duplicated genes and highly repetitive DNA, making sequence assembly of their genomic regions challenging. Here, we constructed high-quality sequences spanning the three wheat homeologous a-gliadin loci by aligning PacBio-based sequence contigs with BioNano genome maps. A total of 47 a-gliadin genes were identified with only 26 encoding intact full-length protein products. Analyses of a-gliadin loci and phylogenetic tree reconstruction indicate significant duplications of a-gliadin genes in the last ~2.5 million years after the divergence of the A, B and D genomes, supporting its rapid lineage-independent expansion in different Triticeae genomes. We showed that dramatic divergence in expression of a-gliadin genes could not be attributed to sequence variations in the promoter regions. The study also provided insights into the evolution of CD epitopes and identified a single indel event in the hexaploid wheat D genome that likely resulted in the generation of the highly toxic 33-mer CD epitope.


September 22, 2019  |  

Targeted sequencing by gene synteny, a new strategy for polyploid species: sequencing and physical structure of a complex sugarcane region.

Sugarcane exhibits a complex genome mainly due to its aneuploid nature and high ploidy level, and sequencing of its genome poses a great challenge. Closely related species with well-assembled and annotated genomes can be used to help assemble complex genomes. Here, a stable quantitative trait locus (QTL) related to sugar accumulation in sorghum was successfully transferred to the sugarcane genome. Gene sequences related to this QTL were identified in silico from sugarcane transcriptome data, and molecular markers based on these sequences were developed to select bacterial artificial chromosome (BAC) clones from the sugarcane variety SP80-3280. Sixty-eight BAC clones containing at least two gene sequences associated with the sorghum QTL were sequenced using Pacific Biosciences (PacBio) technology. Twenty BAC sequences were found to be related to the syntenic region, of which nine were sufficient to represent this region. The strategy we propose is called “targeted sequencing by gene synteny,” which is a simpler approach to understanding the genome structure of complex genomic regions associated with traits of interest.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.