January 15, 2018

PAG Conference: Long-read sequencing reveals complex genomic architecture in independent carnivorous plant lineages

In this PAG 2018 presentation, Tanya Renner of Pennsylvania State University shares research using PacBio SMRT Sequencing to understand the genomes and transcriptomes of carnivorous plants. She describes the humped bladderwort, Utricularia gibba, as having an extreme genome: it is small (100 Mbp) despite containing numerous tandem gene duplications and having undergone two whole-genome duplications. Renner also shares ongoing research into two Drosera species, commonly known as sundews, in which whole-genome sequencing is illuminating the structural evolution of carnivorous plant genomes, including the transition from monocentric to holocentric chromosomes.

November 9, 2017

Copy number variation and expression analysis reveals a nonorthologous pinta gene family member involved in butterfly vision.

Vertebrate (cellular retinaldehyde-binding protein) and Drosophila (prolonged depolarization afterpotential is not apparent [PINTA]) proteins with a CRAL-TRIO domain transport retinal-based chromophores that bind to opsin proteins and are necessary for phototransduction. The CRAL-TRIO domain gene family is composed of genes that encode proteins with a common N-terminal structural domain. Although there is an expansion of this gene family in Lepidoptera, there is no lepidopteran ortholog of pinta. Further, the function of these genes in lepidopterans has not yet been established. Here, we explored the molecular evolution and expression of CRAL-TRIO domain genes in the butterfly Heliconius melpomene in order to…

July 19, 2017

Hybrid assembly with long and short reads improves discovery of gene family expansions.

Long-read and short-read sequencing technologies offer competing advantages for eukaryotic genome sequencing projects. Combinations of both may be appropriate for surveys of within-species genomic variation. We developed a hybrid assembly pipeline called "Alpaca" that can operate on 20X long-read coverage plus about 50X short-insert and 50X long-insert short-read coverage. To preclude collapse of tandem repeats, Alpaca relies on base-call-corrected long reads for contig formation. Compared to two other assembly protocols, Alpaca demonstrated the most reference agreement and repeat capture on the rice genome. On three accessions of the model legume Medicago truncatula, Alpaca generated the most agreement to a conspecific reference and…

June 1, 2017

Iterative optimization of xylose catabolism in Saccharomyces cerevisiae using combinatorial expression tuning.

A common challenge in metabolic engineering is rapidly identifying rate-controlling enzymes in heterologous pathways for subsequent production improvement. We demonstrate a workflow to address this challenge and apply it to improving xylose utilization in Saccharomyces cerevisiae. For eight reactions required for conversion of xylose to ethanol, we screened enzymes for functional expression in S. cerevisiae, followed by a combinatorial expression analysis to achieve pathway flux balancing and identification of limiting enzymatic activities. In the next round of strain engineering, we increased the copy number of these limiting enzymes and again tested the eight-enzyme combinatorial expression library in this new background.…

May 26, 2017

Tandem duplications lead to novel expression patterns through exon shuffling in Drosophila yakuba.

One common hypothesis to explain the impacts of tandem duplications is that whole gene duplications commonly produce additive changes in gene expression due to copy number changes. Here, we use genome-wide RNA-seq data from a population sample of Drosophila yakuba to test this 'gene dosage' hypothesis. We observe little evidence of expression changes in response to whole transcript duplication capturing 5' and 3' UTRs. Among whole gene duplications, we observe evidence that dosage sharing across copies is likely to be common. The lack of expression changes after whole gene duplication suggests that the majority of genes are subject to…

October 20, 2016

Characterizing haplotype diversity at the immunoglobulin heavy chain locus across human populations using novel long-read sequencing and assembly approaches

The human immunoglobulin heavy chain locus (IGH) remains among the most understudied regions of the human genome. Recent efforts have shown that haplotype diversity within IGH is elevated and exhibits population-specific patterns; for example, our re-sequencing of the locus from only a single chromosome uncovered >100 Kb of novel sequence, including descriptions of six novel alleles and four previously unmapped genes. Historically, this complex locus architecture has hindered the characterization of IGH germline single nucleotide, copy number, and structural variants (SNVs; CNVs; SVs), and as a result, there remains little known about the role of IGH polymorphisms in inter-individual…

October 12, 2016

Genetic basis of priority effects: insights from nectar yeast.

Priority effects, in which the order of species arrival dictates community assembly, can have a major influence on species diversity, but the genetic basis of priority effects remains unknown. Here, we suggest that nitrogen scavenging genes previously considered responsible for starvation avoidance may drive priority effects by causing rapid resource depletion. Using single-molecule sequencing, we de novo assembled the genome of the nectar-colonizing yeast, Metschnikowia reukaufii, across eight scaffolds and complete mitochondrion, with gap-free coverage over gene spaces. We found a high rate of tandem gene duplication in this genome, enriched for nitrogen metabolism and transport. Both high-capacity amino acid…

December 18, 2015

Case Study: With SMRT Sequencing for genomes, transcriptomes, and epigenomes, scientists are overcoming barriers in plant and animal research

Scientists are using long-read PacBio sequencing to provide uniquely comprehensive views of complex plant and animal genomes. These efforts are uncovering novel biological mechanisms, enabling progress in crop development, and much more. To date, scientists have published over 1000 papers with Single Molecule, Real-Time (SMRT) Sequencing, many covering breakthroughs in the plant and animal sciences. In this case study, we look at examples in the model organisms Drosophila and C. elegans and the non-model organisms coffee, Oropetium, danshen, and sugar beet, where SMRT Sequencing has contributed to a more accurate understanding of biology. These efforts underscore the broad applicability of long-read sequencing in…

