June 1, 2021

Getting the most out of your PacBio libraries with size selection.

PacBio RS II sequencing chemistries provide read lengths beyond 20 kb with high consensus accuracy. The long read lengths of P4-C2 chemistry and demonstrated consensus accuracy of 99.999% are ideal for applications such as de novo assembly, targeted sequencing and isoform sequencing. The recently launched P5-C3 chemistry generates even longer reads with N50 often >10,000 bp, making it the best choice for scaffolding and spanning structural rearrangements. With these chemistry advances, PacBio’s read length performance is now primarily determined by the SMRTbell library itself. Size selection of a high-quality, sheared 20 kb library using the BluePippin™ System has been demonstrated to increase the N50 read length by as much as 5 kb with C3 chemistry. BluePippin size selection or a more stringent AMPure® PB selection cutoff can be used to recover long fragments from degraded genomic material. The selection of chemistries, P4-C2 versus P5-C3, is highly dependent on the final size distribution of the SMRTbell library and experimental goals. PacBio’s long read lengths also allow for the sequencing of full-length cDNA libraries at single-molecule resolution. However, longer transcripts are difficult to detect due to lower abundance, amplification bias, and preferential loading of smaller SMRTbell constructs. Without size selection, most sequenced transcripts are 1-1.5 kb. Size selection dramatically increases the number of transcripts >1.5 kb, and is essential for >3 kb transcripts.
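The N50 read length quoted above is the length L such that reads of length L or longer contain at least half of all sequenced bases. A minimal sketch of that calculation follows, with an in-silico length filter standing in for size selection. The read lengths and the 7 kb cutoff are hypothetical, and a computational filter is only an analogy for physical selection on a BluePippin or AMPure PB step.

```python
def n50(lengths):
    """Return the N50 of a collection of read lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one read length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first read that pushes the running sum to half the bases is the N50.
        if running >= half_total:
            return length


def size_select(lengths, cutoff_bp):
    """Keep only reads at or above the cutoff (illustrative stand-in for size selection)."""
    return [n for n in lengths if n >= cutoff_bp]


# Hypothetical read lengths in bp, not data from the abstract.
reads = [2000, 3000, 5000, 8000, 12000, 20000]
print(n50(reads))                     # 12000
print(n50(size_select(reads, 7000)))  # 20000
```

Removing the short fragments raises the N50 because the same long reads now account for a larger share of the total bases.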


June 1, 2021

SMRT Sequencing solutions for plant genomes and transcriptomes

Single Molecule, Real-Time (SMRT) Sequencing provides efficient, streamlined solutions to address new frontiers in plant genomes and transcriptomes. Inherent challenges presented by highly repetitive, low-complexity regions and duplication events are directly addressed with multi-kilobase read lengths exceeding 8.5 kb on average, with many exceeding 20 kb. Differentiating between transcript isoforms that are difficult to resolve with short-read technologies is also now possible. We present solutions available for both reference genome and transcriptome research that best leverage long reads in several plant projects including algae, Arabidopsis, rice, and spinach using only the PacBio platform. Benefits for these applications are further realized with consistent use of size-selection of input sample using the BluePippin™ device from Sage Science. We will share highlights from our genome projects using the latest P5-C3 chemistry to generate high-quality reference genomes with the highest contiguity, contig N50 exceeding 1 Mb, and average base quality of QV50. Additionally, the value of long, intact reads to provide a no-assembly approach to investigate transcript isoforms using our Iso-Seq protocol will be presented for full transcriptome characterization and targeted surveys of genes with complex structures. PacBio provides the most comprehensive assembly with annotation when combining offerings for both genome and transcriptome research efforts. For more focused investigation, PacBio also offers researchers opportunities to easily investigate and survey genes with complex structures.


September 22, 2019

Revertant mosaicism repairs skin lesions in a patient with keratitis-ichthyosis-deafness syndrome by second-site mutations in connexin 26.

Revertant mosaicism (RM) is a naturally occurring phenomenon where the pathogenic effect of a germline mutation is corrected by a second somatic event. Development of healthy-looking skin due to RM has been observed in patients with various inherited skin disorders, but not in connexin-related disease. We aimed to clarify the underlying molecular mechanisms of suspected RM in the skin of a patient with keratitis-ichthyosis-deafness (KID) syndrome. The patient was diagnosed with KID syndrome due to characteristic skin lesions, hearing deficiency and keratitis. Investigation of GJB2 encoding connexin (Cx) 26 revealed heterozygosity for the recurrent de novo germline mutation, c.148G>A, p.Asp50Asn. At age 20, the patient developed spots of healthy-looking skin that grew in size and number within widespread erythrokeratodermic lesions. Ultra-deep sequencing of two healthy-looking skin biopsies identified five somatic nonsynonymous mutations, independently present in cis with the p.Asp50Asn mutation. Functional studies of Cx26 in HeLa cells revealed co-expression of Cx26-Asp50Asn and wild-type Cx26 in gap junction channel plaques. However, Cx26-Asp50Asn with the second-site mutations identified in the patient displayed no formation of gap junction channel plaques. We argue that the second-site mutations independently inhibit Cx26-Asp50Asn expression in gap junction channels, reverting the dominant negative effect of the p.Asp50Asn mutation. To our knowledge, this is the first time RM has been reported to result in the development of healthy-looking skin in a patient with KID syndrome. © The Author 2017. Published by Oxford University Press.


September 22, 2019

Next generation sequencing technology: Advances and applications.

Impressive progress has been made in the field of Next Generation Sequencing (NGS). Through advancements in the fields of molecular biology and technical engineering, parallelization of the sequencing reaction has profoundly increased the total number of produced sequence reads per run. Current sequencing platforms allow for an unprecedented view into complex mixtures of RNA and DNA samples. NGS is currently evolving into a molecular microscope finding its way into virtually every field of biomedical research. In this chapter we review the technical background of the different commercially available NGS platforms with respect to template generation and the sequencing reaction and take a small step towards what the upcoming NGS technologies will bring. We close with an overview of different implementations of NGS into biomedical research. This article is part of a Special Issue entitled: From Genome to Function. Copyright © 2014 Elsevier B.V. All rights reserved.


September 22, 2019

MEGAN-LR: new algorithms allow accurate binning and easy interactive exploration of metagenomic long reads and contigs.

There are numerous computational tools for taxonomic or functional analysis of microbiome samples, optimized to run on hundreds of millions of short, high quality sequencing reads. Programs such as MEGAN allow the user to interactively navigate these large datasets. Long read sequencing technologies continue to improve and produce increasing numbers of longer reads (of varying lengths in the range of 10k-1M bp, say), but of low quality. There is an increasing interest in using long reads in microbiome sequencing, and there is a need to adapt short read tools to long read datasets. We describe a new LCA-based algorithm for taxonomic binning, and an interval-tree based algorithm for functional binning, that are explicitly designed for long reads and assembled contigs. We provide a new interactive tool for investigating the alignment of long reads against reference sequences. For taxonomic and functional binning, we propose to use LAST to compare long reads against the NCBI-nr protein reference database so as to obtain frame-shift aware alignments, and then to process the results using our new methods. All presented methods are implemented in the open source edition of MEGAN, and we refer to this new extension as MEGAN-LR (MEGAN long read). We evaluate the LAST+MEGAN-LR approach in a simulation study, and on a number of mock community datasets consisting of Nanopore reads, PacBio reads and assembled PacBio reads. We also illustrate the practical application on a Nanopore dataset that we sequenced from an anammox bioreactor community. This article was reviewed by Nicola Segata together with Moreno Zolfo, Pete James Lockhart and Serghei Mangul. This work extends the applicability of the widely-used metagenomic analysis software MEGAN to long reads. Our study suggests that the presented LAST+MEGAN-LR pipeline is sufficiently fast and accurate.
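For orientation, the sketch below shows the classic naive lowest-common-ancestor (LCA) assignment that LCA-based binning builds on: a read is placed on the deepest taxon shared by all the taxa it aligns to. This is not the long-read algorithm described in the abstract, whose details are not given here. The taxonomy is a deliberately simplified toy, and the names `PARENT`, `lineage` and `naive_lca` are hypothetical.

```python
# Toy taxonomy as a child -> parent map (ranks are skipped for brevity).
PARENT = {
    "root": None,
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Escherichia": "Proteobacteria",
    "Salmonella": "Proteobacteria",
    "Firmicutes": "Bacteria",
    "Bacillus": "Firmicutes",
}


def lineage(taxon, parent):
    """Return the path from a taxon up to the root, taxon first."""
    path = [taxon]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def naive_lca(taxa, parent):
    """Return the deepest taxon that is an ancestor of (or equal to) every input taxon."""
    taxa = list(taxa)
    if not taxa:
        raise ValueError("need at least one taxon")
    # Start from the first lineage (leaf to root) and keep only shared nodes.
    common = lineage(taxa[0], parent)
    for taxon in taxa[1:]:
        other = set(lineage(taxon, parent))
        common = [node for node in common if node in other]
    # The list is still ordered leaf to root, so the first entry is the lowest.
    return common[0]


print(naive_lca(["Escherichia", "Salmonella"], PARENT))  # Proteobacteria
print(naive_lca(["Escherichia", "Bacillus"], PARENT))    # Bacteria
```

A read whose alignments all hit one genus stays at that genus, while a read with conflicting hits is pushed up to a less specific node.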


September 22, 2019

A whole genome assembly of the horn fly, Haematobia irritans, and prediction of genes with roles in metabolism and sex determination.

Haematobia irritans, commonly known as the horn fly, is a globally distributed blood-feeding pest of cattle that is responsible for significant economic losses to cattle producers. Chemical insecticides are the primary means for controlling this pest but problems with insecticide resistance have become common in the horn fly. To provide a foundation for identification of genomic loci for insecticide resistance and for discovery of new control technology, we report the sequencing, assembly, and annotation of the horn fly genome. The assembled genome is 1.14 Gb, comprising 76,616 scaffolds with N50 scaffold length of 23 Kb. Using RNA-Seq data, we have predicted 34,413 gene models of which 19,185 have been assigned functional annotations. Comparative genomics analysis with the Dipteran flies Musca domestica L., Drosophila melanogaster, and Lucilia cuprina shows that the horn fly is most closely related to M. domestica, sharing 8,748 orthologous clusters, followed by D. melanogaster and L. cuprina, sharing 7,582 and 7,490 orthologous clusters respectively. We also identified a gene locus for the sodium channel protein in which mutations have been previously reported that confer target site resistance to the most common class of pesticides used in fly control. Additionally, we identified 276 genomic loci encoding members of metabolic enzyme gene families such as cytochrome P450s, esterases and glutathione S-transferases, and several genes orthologous to sex determination pathway genes in other Dipteran species. Copyright © 2018 Konganti et al.


September 22, 2019

The African Bullfrog (Pyxicephalus adspersus) genome unites the two ancestral ingredients for making vertebrate sex chromosomes

Heteromorphic sex chromosomes have evolved repeatedly among vertebrate lineages despite largely deleterious reductions in gene dose. Understanding how this gene dose problem is overcome is hampered by the lack of genomic information at the base of tetrapods and comparisons across the evolutionary history of vertebrates. To address this problem, we produced a chromosome-level genome assembly for the African Bullfrog (Pyxicephalus adspersus), an amphibian with heteromorphic ZW sex chromosomes, and discovered that the Bullfrog Z is surprisingly homologous to substantial portions of the human X. Using this new reference genome, we identified ancestral synteny among the sex chromosomes of major vertebrate lineages, showing that non-mammalian sex chromosomes are strongly associated with a single vertebrate ancestral chromosome, while mammals are associated with another that displays increased haploinsufficiency. The sex chromosomes of the African Bullfrog, however, share genomic blocks with both humans and non-mammalian vertebrates, connecting the two ancestral chromosome sequences that repeatedly characterize vertebrate sex chromosomes. Our results highlight the consistency of sex-linked sequences despite sex determination system lability and reveal the repeated use of two major genomic sequence blocks during vertebrate sex chromosome evolution.


September 22, 2019

De novo genome assembly of Oryza granulata reveals rapid genome expansion and adaptive evolution

The wild relatives of rice have adapted to different ecological environments and constitute a useful reservoir of agronomic traits for genetic improvement. Here we present the ~777 Mb de novo assembled genome sequence of Oryza granulata. Recent bursts of long-terminal repeat retrotransposons, especially RIRE2, led to a rapid twofold increase in genome size after O. granulata speciation. Universal centromeric tandem repeats are absent within its centromeres, while gypsy-type LTRs constitute the main centromere-specific repetitive elements. A total of 40,116 protein-coding genes were predicted in O. granulata, which is close to that of Oryza sativa. Both the copy number and function of genes involved in photosynthesis and energy production have undergone positive selection during the evolution of O. granulata, which might have facilitated its adaptation to the low light habitats. Together, our findings reveal the rapid genome expansion, distinctive centromere organization, and adaptive evolution of O. granulata.


September 22, 2019

Natural selection in bats with historical exposure to white-nose syndrome

Hibernation allows animals to survive periods of resource scarcity by reducing their energy expenditure through decreased metabolism. However, hibernators become susceptible to psychrophilic pathogens if they cannot mount an efficient immune response to infection. While Nearctic bats infected with white-nose syndrome (WNS) suffer high mortality, related Palearctic taxa are better able to survive the disease than their Nearctic counterparts. We hypothesised that WNS exerted historical selective pressure in Palearctic bats, resulting in genomic changes that promote infection tolerance.


July 19, 2019

Microsatellite marker discovery using single molecule real-time circular consensus sequencing on the Pacific Biosciences RS.

Microsatellite sequences are important markers for population genetics studies. In the past, the development of adequate microsatellite primers has been cumbersome. However, with the advent of next-generation sequencing technologies, marker identification in genomes of non-model species has been greatly simplified. Here we describe microsatellite discovery on a Pacific Biosciences single molecule real-time sequencer. For the Greater White-fronted Goose (Anser albifrons), we identified 316 microsatellite loci in a single genome shotgun sequencing experiment. We found that the capability of handling large insert sizes and high quality circular consensus sequences provides an advantage over short read technologies for primer design. Combined with a straightforward amplification-free library preparation, PacBio sequencing is an economically viable alternative for microsatellite discovery and subsequent PCR primer design.
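The abstract does not say which tool was used to detect microsatellites in the consensus reads. As a rough illustration of the idea only, the sketch below scans a sequence for perfect tandem repeats of a 2 to 6 bp motif with a regular expression. The function name, the default thresholds and the example sequence are all hypothetical, and real pipelines typically also handle imperfect repeats and flanking-sequence checks for primer design.

```python
import re


def find_microsatellites(seq, min_repeats=5, min_motif=2, max_motif=6):
    """Return (start, motif, copies) for each perfect tandem repeat in seq."""
    # Lazy motif group so the shortest repeating unit is tried first,
    # followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        # Skip homopolymer runs such as AAAAAAAAAA, which are not of interest here.
        if len(set(motif)) == 1:
            continue
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


# Hypothetical sequence containing an (AC)5 repeat starting at 0-based position 3.
print(find_microsatellites("TTGACACACACACAGGT"))  # [(3, 'AC', 5)]
```

Longer reads help at the next step because both flanks of each repeat are more likely to be present in the same read, which is what primer design needs.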


July 19, 2019

PacBio-LITS: a large-insert targeted sequencing method for characterization of human disease-associated chromosomal structural variations.

Generation of long (>5 Kb) DNA sequencing reads provides an approach for interrogation of complex regions in the human genome. Currently, large-insert whole genome sequencing (WGS) technologies from Pacific Biosciences (PacBio) enable analysis of chromosomal structural variations (SVs), but the cost to achieve the required sequence coverage across the entire human genome is high. We developed a method (termed PacBio-LITS) that combines oligonucleotide-based DNA target-capture enrichment technologies with PacBio large-insert library preparation to facilitate SV studies at specific chromosomal regions. PacBio-LITS provides deep sequence coverage at the specified sites at substantially reduced cost compared with PacBio WGS. The efficacy of PacBio-LITS is illustrated by delineating the breakpoint junctions of low copy repeat (LCR)-associated complex structural rearrangements on chr17p11.2 in patients diagnosed with Potocki-Lupski syndrome (PTLS; MIM#610883). We successfully identified previously determined breakpoint junctions in three PTLS cases, and also were able to discover novel junctions in repetitive sequences, including LCR-mediated breakpoints. The new information has enabled us to propose mechanisms for formation of these structural variants. The new method leverages the cost efficiency of targeted capture-sequencing as well as the mappability and scaffolding capabilities of long sequencing reads generated by the PacBio platform. It is therefore suitable for studying complex SVs, especially those involving LCRs, inversions, and the generation of chimeric Alu elements at the breakpoints. Other genomic research applications, such as haplotype phasing and small insertion and deletion validation, could also benefit from this technology.


July 19, 2019

CGGBP1 mitigates cytosine methylation at repetitive DNA sequences.

CGGBP1 is a repetitive DNA-binding transcription regulator with target sites at CpG-rich sequences such as CGG repeats, Alu-SINEs and L1-LINEs. The role of CGGBP1 as a possible mediator of CpG methylation, however, remains unknown. At CpG-rich sequences, cytosine methylation is a major mechanism of transcriptional repression. Concordantly, gene-rich regions typically carry lower levels of CpG methylation than the repetitive elements. It is well known that at the interspersed repeats Alu-SINEs and L1-LINEs, high levels of CpG methylation constitute a transcriptional silencing and retrotransposon inactivating mechanism. Here, we have studied genome-wide CpG methylation with or without CGGBP1-depletion. By high throughput sequencing of bisulfite-treated genomic DNA we have identified CGGBP1 to be a negative regulator of CpG methylation at repetitive DNA sequences. In addition, we have studied CpG methylation alterations on Alu and L1 retrotransposons in CGGBP1-depleted cells using a novel bisulfite-treatment and high throughput sequencing approach. The results clearly show that CGGBP1 is a possible bidirectional regulator of CpG methylation at Alus, and acts as a repressor of methylation at L1 retrotransposons.


July 19, 2019

Identification of a common risk haplotype for canine idiopathic epilepsy in the ADAM23 gene.

Idiopathic epilepsy is a common neurological disease in humans and domestic dogs but relatively few risk genes have been identified to date. The seizure characteristics, including focal and generalised seizures, are similar between the two species, with gene discovery facilitated by the reduced genetic heterogeneity of purebred dogs. We have recently identified a risk locus for idiopathic epilepsy in the Belgian Shepherd breed on a 4.4 megabase region on CFA37. We have expanded a previous study replicating the association with a combined analysis of 157 cases and 179 controls in three additional breeds: Schipperke, Finnish Spitz and Beagle (pc = 2.9e-07, pGWAS = 1.74E-02). A targeted resequencing of the 4.4 megabase region in twelve Belgian Shepherd cases and twelve controls with opposite haplotypes identified 37 case-specific variants within the ADAM23 gene. Twenty-seven variants were validated in 285 cases and 355 controls from four breeds, resulting in a strong replication of the ADAM23 locus (praw = 2.76e-15) and the identification of a common 28 kb-risk haplotype in all four breeds. The risk haplotype was present at frequencies of 0.49-0.7 in the breeds, suggesting that ADAM23 is a low penetrance risk gene for canine epilepsy. These results implicate ADAM23 in common canine idiopathic epilepsy, although the causative variant remains to be identified. ADAM23 plays a role in synaptic transmission and interacts with known epilepsy genes, LGI1 and LGI2, and should be considered as a candidate gene for human epilepsies.


July 19, 2019

Single-Molecule Real-Time Sequencing combined with optical mapping yields completely finished fungal genome.

Next-generation sequencing (NGS) technologies have increased the scalability, speed, and resolution of genomic sequencing and, thus, have revolutionized genomic studies. However, eukaryotic genome sequencing initiatives typically yield considerably fragmented genome assemblies. Here, we assessed various state-of-the-art sequencing and assembly strategies in order to produce a contiguous and complete eukaryotic genome assembly, focusing on the filamentous fungus Verticillium dahliae. Compared with Illumina-based assemblies of the V. dahliae genome, hybrid assemblies that also include PacBio-generated long reads establish superior contiguity. Intriguingly, provided that sufficient sequence depth is reached, assemblies solely based on PacBio reads outperform hybrid assemblies and even result in fully assembled chromosomes. Furthermore, the addition of optical map data allowed us to produce a gapless and complete V. dahliae genome assembly of the expected eight chromosomes from telomere to telomere. Consequently, we can now study genomic regions that were previously not assembled or poorly assembled, including regions that are populated by repetitive sequences, such as transposons, allowing us to fully appreciate an organism’s biological complexity. Our data show that a combination of PacBio-generated long reads and optical mapping can be used to generate complete and gapless assemblies of fungal genomes. Studying whole-genome sequences has become an important aspect of biological research. The advent of next-generation sequencing (NGS) technologies has nowadays brought genomic science within reach of most research laboratories, including those that study nonmodel organisms. However, most genome sequencing initiatives typically yield (highly) fragmented genome assemblies. Nevertheless, considerable relevant information related to genome structure and evolution is likely hidden in those nonassembled regions. Here, we investigated a diverse set of strategies to obtain gapless genome assemblies, using the genome of a typical ascomycete fungus as the template. Eventually, we were able to show that a combination of PacBio-generated long reads and optical mapping yields a gapless telomere-to-telomere genome assembly, allowing in-depth genome analyses to facilitate functional studies into an organism’s biology. Copyright © 2015 Faino et al.

