2015 SMRT Informatics Developers Conference Presentation Slides: Shinichi Morishita of the University of Tokyo presented on how his team has been using SMRT Sequencing to better understand methylomes, metagenomes and structural variation of various eukaryotic genomes.
A single gene may encode a surprising number of proteins, each with a distinct biological function. This is especially true in complex eukaryotes. Short-read RNA sequencing (RNA-seq) works by physically shearing transcript isoforms into smaller pieces and bioinformatically reassembling them, leaving opportunity for misassembly or incomplete capture of the full diversity of isoforms from genes of interest. The PacBio Isoform Sequencing (Iso-Seq™) method employs long reads to sequence transcript isoforms from the 5’ end to their poly-A tails, eliminating the need for transcript reconstruction and inference. These long reads result in complete, unambiguous information about alternatively spliced exons, transcriptional…
The widespread occurrence of repetitive stretches of DNA in genomes of organisms across the tree of life imposes fundamental challenges for sequencing, genome assembly, and automated annotation of genes and proteins. This multi-level problem can lead to errors in genome and protein databases that are often not recognized or acknowledged. As a consequence, end users working with sequences with repetitive regions are faced with ‘ready-to-use’ deposited data whose trustworthiness is difficult to determine, let alone to quantify. Here, we provide a review of the problems associated with tandem repeat sequences that originate from different stages during the sequencing-assembly-annotation-deposition workflow, and…
p53 is one of the most important tumour suppressor proteins currently known. It is activated in response to DNA damage and this activation leads to proliferation arrest and cell death. The abundance and activity of p53 are tightly controlled and reductions in p53’s activity can contribute to the development of cancer. Here, we show that Fam83F increases p53 protein levels by protein stabilisation. Fam83F interacts with p53 and decreases its ubiquitination and degradation. Fam83F is induced in response to DNA damage and its overexpression also increases p53 activity in cell culture experiments and in zebrafish embryos. Downregulation of Fam83F decreases…
The severity of environmental conditions at Earth’s frigid zones presents attractive opportunities for microbial biomining due to their heightened potential as reservoirs for novel secondary metabolites. Arid soil microbiomes within the Antarctic and Arctic circles are remarkably rich in Actinobacteria and Proteobacteria, bacterial phyla known to be prolific producers of natural products. Yet the diversity of secondary metabolite genes within these cold, extreme environments remains largely unknown. Here, we employed amplicon sequencing using PacBio RS II, a third-generation long-read platform, to survey over 200 soils spanning twelve east Antarctic and high Arctic sites for natural product-encoding genes, specifically targeting…
African cichlid fishes are well known for their rapid radiations and are a model system for studying evolutionary processes. Here we compare multiple, high-quality, chromosome-scale genome assemblies to elucidate the genetic mechanisms underlying cichlid diversification and study how genome structure evolves in rapidly radiating lineages. We re-anchored our recent assembly of the Nile tilapia (Oreochromis niloticus) genome using a new high-density genetic map. We also developed a new de novo genome assembly of the Lake Malawi cichlid, Metriaclima zebra, using high-coverage Pacific Biosciences sequencing, and anchored contigs to linkage groups (LGs) using 4 different genetic maps. These new anchored assemblies allow…
Supernumerary B chromosomes (Bs) are extra karyotype units in addition to A chromosomes, and are found in some fungi and thousands of animal and plant species. Bs are distinguished by their non-Mendelian inheritance, and represent one of the best examples of genomic conflict. Over the last decades, their genetic composition, function and evolution have remained unresolved, although a few successful attempts have been made to address these questions. A classical concept based on cytogenetics and genetics is that Bs are selfish and rich in DNA repeats and transposons, and in most cases, they do not carry…
For over a thousand years, the common goldfish (Carassius auratus) was raised throughout Asia for food and as an ornamental pet. As a very close relative of the common carp (Cyprinus carpio), goldfish share the recent genome duplication that occurred approximately 14 million years ago in their common ancestor. The combination of centuries of breeding and a wide array of interesting body morphologies provides an exciting opportunity to link genotype to phenotype and to understand the dynamics of genome evolution and speciation. We generated a high-quality draft sequence and gene annotations of a “Wakin” goldfish using 71X PacBio long reads.…
The pig is a well-studied model animal of biomedical and agricultural importance. Genes of this species, Sus scrofa, are known from experiments and predictions, and are collected in the NCBI Reference Sequence database. Gene reconstruction from the transcribed gene evidence of RNA-seq can now accurately and completely reproduce the biological gene sets of animals and plants. Such a gene set for the pig is reported here, including human orthologs missing from current NCBI and Ensembl reference pig gene sets, additional alternate transcripts, and other improvements. Methodology for accurate and complete gene set reconstruction from RNA is used: the automated SRA2Genes pipeline…
Recombination between loci underlying mate choice and ecological traits is a major evolutionary force acting against speciation with gene flow. The evolution of linkage disequilibrium between such loci is therefore a fundamental step in the origin of species. Here, we show that this process can take place in the absence of physical linkage in hamlets, a group of closely related reef fishes from the wider Caribbean that differ essentially in colour pattern and are reproductively isolated through strong visually based assortative mating. Using full-genome analysis, we identify four narrow genomic intervals that are consistently differentiated among sympatric species in a backdrop of…
Unlike its typically anadromous relatives, the Chinese native Dolly Varden char (Salvelinus malma) is landlocked and spends its entire life in fresh water. To explore the effect of freshwater adaptation on its immune system, we constructed a pooled cDNA library from the hepatopancreas and spleen of Chinese freshwater Dolly Varden char (S. malma). A total of 27,829 unigenes were generated from 31,233 high-quality transcripts, and 17,670 complete open reading frames (ORFs) were identified. In total, 25,809 unigenes were successfully annotated; the annotation identified more innate than adaptive immunity-associated genes, and more genes involved in the Toll-like receptor signaling pathway than those in complement and…
The iconic orange clownfish, Amphiprion percula, is a model organism for studying the ecology and evolution of reef fishes, including patterns of population connectivity, sex change, social organization, habitat selection and adaptation to climate change. Notably, the orange clownfish is the only reef fish for which a complete larval dispersal kernel has been established and was the first fish species for which it was demonstrated that antipredator responses of reef fishes could be impaired by ocean acidification. Despite its importance, molecular resources for this species remain scarce and until now it lacked a reference genome assembly. Here, we present a…
Icefishes (suborder Notothenioidei; family Channichthyidae) are the only vertebrates that lack functional haemoglobin genes and red blood cells. Here, we report a high-quality genome assembly and linkage map for the Antarctic blackfin icefish Chaenocephalus aceratus, highlighting evolved genomic features for its unique physiology. Phylogenomic analysis revealed that Antarctic fish of the teleost suborder Notothenioidei, including icefishes, diverged from the stickleback lineage about 77 million years ago and subsequently evolved cold-adapted phenotypes as the Southern Ocean cooled to sub-zero temperatures. Our results show that genes involved in protection from ice damage, including genes encoding antifreeze glycoprotein and zona pellucida proteins, are…
Hongkong kumquat (Fortunella hindsii) is a wild citrus species characterized by dwarf plant height and early flowering. Here, we identified the monoembryonic F. hindsii (designated as ‘Mini-Citrus’) for the first time and constructed its selfing lines. This germplasm constitutes an ideal model for genetic and functional genomics studies of citrus, which have been severely hindered by the long juvenility and inherent apomixis of citrus. F. hindsii showed a very short juvenile period (~8 months) and a stable monoembryonic phenotype under cultivation. We report the first de novo assembled 373.6 Mb genome sequences (Contig-N50 2.2 Mb and Scaffold-N50 5.2 Mb) for F. hindsii. In total, 32 257 protein-coding genes…
RNA alternative polyadenylation contributes to the complexity of information transfer from genome to phenome, thus amplifying gene function. Here, we report the first X. tropicalis resource with 127,914 alternative polyadenylation (APA) sites derived from embryos and adults. Overall, APA networks play central roles in coordinating the maternal-zygotic transition (MZT) in embryos, sexual dimorphism in adults and longitudinal growth from embryos to adults. APA sites coordinate reprogramming in embryos before the MZT, but coordinate developmental events after the MZT owing to zygotic genome activation. The APA transcriptomes of young adults are more variable than those of growing adults and male frog APA transcriptomes are…