This blog features voices from PacBio — and our partners and colleagues — discussing the latest research, publications, and updates about SMRT Sequencing. Check back regularly or sign up to have our blog posts delivered directly to your inbox.
If you are like most of us at PacBio, you likely learned how to extract DNA in a high school or college biology class, or maybe even in your kitchen. But as you moved on to more high-stakes experiments, you may have found that extracting DNA for sequencing in your lab isn’t always as straightforward as lyse, precipitate, wash, suspend. In this introduction to DNA extraction, we will share tips, tricks, and protocols to help make your DNA isolation easier!
For optimal results to power biological discovery, sample prep is a critical step in any sequencing project. And with long-read sequencing technologies, including HiFi sequencing, you not only want DNA free from nicks and degradation, but you also want long fragments (tens of kilobases) to achieve those coveted long reads.
Long-read sequencing expert and sample wrangler Olga Pettersson (@OlgaVPettersson) of SciLifeLab at Uppsala University advises: “Aim for getting molecules as long as you can, as pure as you can, as fresh as you can.”
So, what are the factors that go into obtaining high-molecular-weight (HMW) DNA for sequencing? Jennifer Balacco (@JenBalacco) of the Vertebrate Genome Lab, which aims to sequence the genomes of all living vertebrate species and therefore has a wealth of experience with DNA extraction, points to sample type, the prep and storage of samples, and individualized extraction methods as the key components of successful DNA extraction. You can follow along as she shares her experience with many sample types in the video below and then explore our additional resources and considerations.
Watch this PacBio Virtual Global Summit presentation from Jennifer Balacco of the Vertebrate Genome Lab on DNA extraction approaches to achieve error-free genomes.
DNA Extraction Challenge #1: Sample Type
Both within the same organism and between species, there is variability in how readily HMW DNA can be extracted and how stable it is once extracted. DNA from liver, for example, is known to degrade quickly due to the enzymes that make a functioning liver, while DNA extracted from blood is typically more stable. Some plant species have phenolics and polysaccharides that interfere with extraction, and mollusks have high DNase activity that makes it difficult to store DNA for any amount of time.
If you have a choice on the type of sample you use, a cell-dense tissue with minimal potential contaminants is your best bet. For vertebrates, this means using tissues like blood, brain, kidney, or muscle. For some invertebrates there may be a mucous membrane that inhibits the ability to obtain high-quality DNA, and you might want to consider an additional DNA cleanup step to rid the extraction of contaminants.
When working with small arthropods, you can use an adult individual but may find that pupae or larvae are an easier DNA source than a tough, exoskeleton-covered adult. When planning a fungus sequencing project, consider culturing the sample in order to acquire a single isolate/individual in the case of macroscopic organisms or an isogenic population in the case of microorganisms. And finally, for plants, it is recommended to obtain the youngest leaf/shoot tissue from an individual plant that has been dark treated (kept out of light) for 24-72 hours.
DNA Extraction Challenge #2: Sample Prep and Storage
The second consideration for HMW DNA extraction, after you’ve decided what sample type to use, is how you will treat that sample. Sequencing adheres to the “garbage in, garbage out” rule, so it’s prudent to take care when prepping your samples. In most cases, the freshest sample will work best, followed by samples flash frozen with liquid nitrogen and stored at -80°C. This is because as soon as a tissue is taken from its living organism, it begins to release factors that degrade both DNA and RNA, making it a race against the clock to get the genetic material out intact.
Of course, we can’t always control how a sample is prepped or stored, and in those cases it’s generally worth a try to get the best DNA you can from any given sample. There are examples of ethanol-stored samples providing DNA of sufficient quality, and of museum specimens being used for amplicon sequencing. However you decide to prep and store your sample prior to DNA extraction, the main aim is to minimize the time between sampling and stable storage, which limits enzymatic degradation of the genetic material within.
DNA Extraction Challenge #3: Choosing the Right Method
The final piece of the puzzle when it comes to obtaining HMW DNA for a sequencing project is the method used for extraction. There is no shortage of kits, protocols, and tutorials for DNA extraction, and after spending years trying to find the best one-size-fits-all extraction method for various sample types, we are fairly confident one doesn’t exist! However, there are some approaches that consistently produce plentiful HMW DNA that can be binned by sample type.
In general, “old school” methods using chemicals commonly found in molecular biology labs perform fairly well. For example, phenol and chloroform extractions work well for many tissues, though the chemicals used are dangerous. The cetyl trimethylammonium bromide (CTAB) method for extraction of DNA from plants is also a fairly robust way to yield good DNA. And once you understand the chemistry of how DNA is liberated from cells via these methods, you can tailor the protocols to meet the needs of individual species.
If you’re in the market for a tailored protocol, we encourage you to check out Extract DNA for PacBio, where we have collected many protocols from published projects, organized by organism type. However, if you’re looking for an easy, all-in-one DNA extraction kit to get you started on your sequencing journey, there are a few out there that have produced great DNA for HiFi sequencing, and they are summarized in our DNA extraction technical note. If you are hoping to outsource this step to a DNA extraction lab, explore our Certified Service Providers, many of which offer DNA extraction as a service.
While there might not be a one-size-fits-all solution for extracting DNA, we hope our experience and those of our customers can help point you in the right direction for a successful HiFi sequencing project!
If you are ready to get started with sequencing or simply need help with choosing the best DNA extraction approach, connect with a PacBio Scientist.
If only we could track COVID-19 like we track the weather, with satellites and weather stations placed around the globe monitoring and sounding the alarm about potential storms, floods, droughts and other severe weather events.
A global pathogen surveillance network would save countless lives, and lessons learned from the current coronavirus pandemic could help make it possible, PacBio Chief Scientific Officer Jonas Korlach told Mendelspod host Theral Timpson (@theraltweet).
Korlach joined Brian Caveney, President and Chief Medical Officer of Labcorp, in a recent podcast to discuss SARS-CoV-2 viral surveillance and the trajectory of COVID research, vaccination and treatment.
PacBio has partnered with the national diagnostic testing company to support its large-scale SARS-CoV-2 testing, which has become part of the US Centers for Disease Control’s COVID-19 genomic surveillance effort. Labcorp has sequenced thousands of samples from around the country on its fleet of Sequel II Systems, and worked closely with PacBio to develop a new HiFiViral SARS-CoV-2 Workflow protocol to enable any laboratory to rapidly and efficiently power viral mutation surveillance using PacBio’s HiFi sequencing.
While rapid COVID-19 diagnostic testing is generally being done via PCR methods, there is still an important role for viral sequencing, Korlach said. PCR tests provide very limited information about genomic mutations and might not be able to identify which variant of the virus a person is infected with. HiFi sequencing on the PacBio systems can provide a highly detailed profile of the roughly 30,000-base SARS-CoV-2 genome, including specific mutations and whether there are multiple subtypes of the virus in individual patients, a phenomenon that has been detected.
Labcorp is using both methods, Caveney said. The company has performed more than 38 million COVID-19 PCR tests, and sequenced more than 20,000 genomes with PacBio technology. Not only is the whole-genome sequencing useful in accelerating scientists’ understanding of the virus as it evolves, but it has helped Labcorp ensure its PCR tests continue to be sensitive to emerging variants and mutations.
“Our research and development team loves working with the PacBio equipment. They like the incredible ability to have high specificity with the long reads that we’re getting,” Caveney said. “It’s going to continue to be a very important research tool for both sides of the house — the diagnostic side, as well as the clinical research side — to make sure that the best medications, therapeutics and vaccines are coming to market.”
Another benefit of sequencing technology in the realm of infectious diseases is that it is a “universal measuring device,” Korlach said. Whether the pathogen is a virus or a bacterium, a sequencer can detect either.
“It’s COVID today and the variants of COVID tomorrow, but what about all the other infectious disease agents that for many decades have cost millions of lives?” Korlach said. “We now have opportunities to tackle them a lot better than we have in the past, using the COVID pandemic as a blueprint.”
Caveney agreed. “We’re so focused on COVID, but 30, 40, 50,000 Americans die every year from influenza. And we now have learning from COVID that might help us bring that number down in the future. That would be a great win, in spite of the tragedy we just went through.”
So what would it take to create a global pan-pathogen surveillance network?
Collaboration, between and among scientific communities, public health agencies, and private companies, Caveney said. An international standardization of nomenclature is also high on his wishlist, “so that regardless of the instruments or the technology used to do the sequencing, it results in information that can be compared and assimilated in a way that all scientists and doctors know what to do with it.”
Continued investment, Korlach said. “Are we willing to keep investing and focusing on making that change permanent and applying it to other infectious diseases, of really building out a permanent and stable network where the routine medical care is going to shift from measuring a temperature and looking in your mouth to getting samples genomically tested within days? That is a future that I think is possible, and that I would like to be part of trying to do our little part to make that happen.”
To learn more about genome surveillance and the benefits of PacBio sequencing, explore our COVID-19 sequencing tools and resources.
Today we’re pleased to announce the launch of a new HiFi Sequencing workflow along with a software update for the Sequel II and Sequel IIe Systems that will increase the number of HiFi reads at or above 99.9% accuracy (QV30) for whole genome sequencing-based applications. Together, these advances will improve the quality of HiFi Sequencing while providing an efficient and scalable workflow for sequencing hundreds to thousands of whole human genomes per year on Sequel Systems.
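For readers less familiar with quality values, the "99.9% accuracy (QV30)" pairing follows from the standard Phred scale, where QV = -10 · log10(error probability). The sketch below is only an illustration of that conversion; the function names are our own and are not part of SMRT Link or any PacBio tool.

```python
import math


def accuracy_to_qv(accuracy: float) -> float:
    """Convert a read accuracy (strictly between 0 and 1) to a Phred-scaled QV."""
    if not 0.0 < accuracy < 1.0:
        raise ValueError("accuracy must be strictly between 0 and 1")
    error_probability = 1.0 - accuracy
    return -10.0 * math.log10(error_probability)


def qv_to_accuracy(qv: float) -> float:
    """Convert a Phred-scaled QV back to the corresponding accuracy."""
    return 1.0 - 10.0 ** (-qv / 10.0)


if __name__ == "__main__":
    # Illustrative accuracies only; rounded to one decimal for display.
    for acc in (0.99, 0.999, 0.9999):
        print(f"accuracy {acc:.4f} -> QV {accuracy_to_qv(acc):.1f}")
```

Each additional 10 QV points corresponds to a tenfold drop in the expected error rate, which is why QV20, QV30, and QV40 map to 99%, 99.9%, and 99.99% accuracy.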
This high-throughput sequencing and analysis workflow release includes a new HiFi library prep protocol offering a three-fold reduction in DNA input, enabling HiFi sequencing with limited sample quantities (neonatal blood, tissue biopsies, and cell lines).
Developed in collaboration with Children’s Mercy Kansas City, the release supports the adoption of HiFi reads for comprehensive variant detection to better understand the genetic causes of rare and inherited diseases. In a statement announcing the release, Emily Farrow, Director of Lab Operations at Children’s Mercy Research Institute, said: “This new workflow provides efficiency in our lab where now two research scientists can comfortably produce one thousand HiFi libraries a year, with the hope of doubling the throughput for library prep by automated liquid handling currently tested in the laboratory.”
The release also features new enabling workflows for variant calling and analysis of the SARS-CoV-2 genome in combination with the recently released high-throughput COVID sequencing protocol developed in partnership with Labcorp.
Jasmine Pritchard, our Vice President of Product Marketing, said, “We see building enthusiasm in the market for HiFi sequencing and this new release demonstrates our commitment to continuously improving our already industry-leading accuracy and key aspects of the workflow. Our team is focused on delivering advancements across the full spectrum of our portfolio, from sample preparation to downstream analysis.”
The HiFi Sequencing and Software v10.1 Release is available to order today and includes the following features:
- New Consumables: SMRTbell Enzyme Clean Up Kit 2.0, Sequel II Primer v5, Polymerase Binding Kit 2.2
- HiFi Protocol: Updated HiFi Express protocol enabling reduced DNA input
- Sequel II ICS v10.1: On-instrument workflow improvements that simplify run set up, especially for multiplexed applications
- SMRT Link v10.1: Updates for Adaptive Loading, our new HiFiViral for SARS-CoV-2 analysis application, and improved Iso-Seq Analysis for multiplexed samples
We also invite you to watch our on-demand Rare Disease Week event to hear how scientists are using HiFi sequencing to help identify causative variants and increase solve rates in rare disease research.
Ready to get started with HiFi sequencing? Connect with a PacBio scientist for a free project or instrument consultation.
By Jonas Korlach, Chief Scientific Officer
Grapy dusks over tangerine fields. Potato-patch fog over beds of coral. Mountains, glaciers, forests, deserts, fertile farmland and seas with both Arctic and tropical biomes.
One of the most geographically and biologically diverse states, California is home to both the highest (Mount Whitney) and lowest (Death Valley) points in the 48 contiguous states, as well as to some of the world’s most exceptional trees — the tallest (coast redwood), most massive (Giant Sequoia), and oldest (bristlecone pine).
At PacBio, we are extremely fortunate to have this biodiversity in our back yard — almost literally. We didn’t have to travel far to take samples of the giant California redwood as part of a personal project to sequence its gigantic genome and transcriptome.
It’s one of the reasons we are excited to work with the California Conservation Genomics Project, a collaboration of scientists across the state that has selected more than 100 threatened, endangered or otherwise valuable species sampled from the full array of California ecosystems for HiFi sequencing and assembly.
The purpose of the $10 million state-funded project is to capture the genetic variation that exists across each species’ habitat, with the ultimate objective of informing smarter development and more effective conservation.
How can genetics inform conservation? More biodiversity means more resilient ecosystems, and conservationists have long focused on preserving habitats and studying the roles of species within ecosystems. But they are now recognizing the important role genetic variation can play in the long-term survival of a species.
Populations with high genetic diversity are more likely to contain individuals with a genetic makeup that allows them to survive new environmental pressures. Populations with low genetic diversity might not even survive the next big threat, so it is crucial to identify individuals with genetic variation in order to conserve the species’ ability to survive and evolve.
Threats to one population can threaten others, including ours. A collapsing ecosystem affects all the species that rely on it. So preserving biodiversity is also an exercise in self-preservation.
California will not be the only ecosystem to benefit from the CCGP research. In many ways, the state is a microcosm of what’s happening to biodiversity around the world. It faces threats similar to those faced by habitats on other continents: climate change, wildfires, droughts, and an ever-expanding population that encroaches onto formerly wild lands.
Its efforts will be boosted by other international initiatives, such as the United Kingdom’s Darwin Tree of Life Project, Australia’s Oz Mammals Genomics Initiative, the Vertebrate Genomes Project and The Earth BioGenome Project, whose ambitious goal is to sequence the DNA of 1.5 million species by 2030. We’re proud that PacBio technology is being used in all of these projects. You can learn more about the biodiversity initiatives PacBio sequencing is supporting in my recent presentation at the Senckenberg Biodiversity Genomics Symposium.
While COVID-19 has focused the attention of the scientific community — including our own — on pathogen detection, surveillance and drug development, lockdown has also spurred a renewed appreciation of nature. How many of us have sought solace in a temple of trees — in some cases, amongst towering columns of sequoias older than the Parthenon?
On this Earth Day, I urge all of you to do your part to “Restore our Earth,” whether that be committing to a home conservation project, or supporting an international one. At PacBio, we will be participating in public awareness campaigns and contributing our time and expertise in support of these important biodiversity initiatives. Let’s make every day Earth Day.
When size matters and you need to be able to detect both single nucleotide changes and large repeated sequences, SMRT Sequencing on the Sequel II System is the way to go, concluded rare disease researchers at Centre de Recherche en Myologie at Sorbonne Université/INSERM.
Stéphanie Tomé (@TomeStephanie) and colleagues used the highly sensitive, comprehensive long-read sequencing to investigate myotonic dystrophy type 1 (DM1), the most complex and variable trinucleotide repeat disorder, caused by an unstable CTG repeat expansion that can reach up to 4,000 triplets in those affected most severely with the disease.
As reported previously, the length of these repeated CTG sections and any interruptions in the sequences have been found to correlate with the severity and onset of symptoms of the neuromuscular autosomal disorder, which is the most common form of inherited muscular dystrophy in adults.
The highly variable clinical presentation of DM1 and current limitations in methods to determine the size and variant repeat interruptions of the large CTG repeat expansions make genetic counseling for the condition very complex, so Tomé turned to PacBio sequencing to better understand this mutation. She successfully applied for a 2019 Targeted Sequencing SMRT Grant, and the results of her work were recently published in the International Journal of Molecular Sciences.
“Better characterization of expanded alleles in DM1 patients can significantly improve prognosis and genetic counseling, not only in DM1 but also for other tandem DNA repeat disorders,” Tomé said.
Inherited CTG repeat expansion size and the level of somatic mosaicism are traditionally evaluated by Southern blot and polymerase chain reaction (PCR), which do not provide any information on the sequence of CTG repeat expansion. Triplet-primed PCR testing may detect the presence of interruptions at the 5’ and 3’ ends of the CTG repeat expansion, and short-read sequencing can help identify further interruptions, but the methods give no information about the middle of the sequence.
Using the Sequel II System, Tomé’s team was able to sequence repeat expansions 1,000 CTG triplets long, detect a single CAG and multiple CCG interruptions, and also estimate somatic mosaicism (the occurrence of two genetically distinct populations of cells within an individual derived from a postzygotic mutation) within two DM1 families, all with more accuracy than conventional PCR.
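To make the idea of "interruptions" concrete, here is a minimal sketch of how non-CTG triplets could be located in a repeat tract. It is not the analysis pipeline used in the study. It assumes a hypothetical input that has already been trimmed to the repeat tract, is in frame, and is free of sequencing errors, and the function name `summarize_repeat` is our own.

```python
from collections import Counter


def summarize_repeat(tract: str, unit: str = "CTG") -> dict:
    """Split an in-frame repeat tract into units and report any interruptions.

    Assumption: `tract` contains only the repeat region, starting in frame.
    """
    tract = tract.upper()
    step = len(unit)
    if len(tract) % step != 0:
        raise ValueError("tract length must be a multiple of the unit length")

    units = [tract[i:i + step] for i in range(0, len(tract), step)]
    # Record (0-based unit index, sequence) for every unit that is not the expected one.
    interruptions = [(idx, u) for idx, u in enumerate(units) if u != unit]

    return {
        "total_units": len(units),
        "pure_units": len(units) - len(interruptions),
        "interruptions": interruptions,
        "interruption_counts": Counter(u for _, u in interruptions),
    }


if __name__ == "__main__":
    # Toy tract: 12 triplets with one CCG and one CAG interruption.
    toy = "CTG" * 5 + "CCG" + "CTG" * 3 + "CAG" + "CTG" * 2
    print(summarize_repeat(toy))
```

Real reads would first need to be aligned or trimmed to the repeat and would require tolerance for indels, which is where highly accurate long reads spanning the whole expansion become useful.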
The data enabled them to gain insights into the genetic changes within the families in the study, as well as some observations applicable to the nature of DM1. They revealed the existence of de novo CCG interruptions associated with CTG stabilization/contraction across generations in one of the families. And the heterogeneity of the number and type of interruptions observed in the interrupted expanded alleles suggested new mechanisms leading to base substitution in the sequence and/or duplication of existing interruptions in the repeated sequence. These could be caused by multiple processes, including spontaneous DNA damage, DNA repair and DNA polymerase errors occurring in germ cells and somatic cells throughout embryogenesis and the lifetime of those affected by DM1.
“Our study reinforced the idea that interrupted alleles do not originate from an ancestral/normal allele, but from unknown mechanisms occurring both in the germline and in somatic cells,” the study concluded.
“SMRT Sequencing opens new avenues for DM1 disease and will provide a better understanding of the clinical and genetic variability observed in DM1 through global analysis,” Tomé added. “This new technology is a straightforward way to detect clinically significant repeat changes and estimate the size of the repeat in blood using targeted sequencing.”
To learn more about how scientists are using highly accurate long-read sequencing in large-scale studies to help identify causative variants, increase solve rates in rare disease research, and support the development of diagnostics for rare and undiagnosed diseases, watch on-demand presentations from PacBio Neuroscience Day, and register for the Rare Disease Week virtual event, April 27-29.
Vaccine safety is of the utmost importance. Respiratory syncytial virus (RSV) is the most common cause of severe lower respiratory tract illness in infants and young children. Much like the flu, it can also cause severe disease in the elderly or immunocompromised adults, making it an important target for vaccine development. In a recently published study, researchers used PacBio long-read sequencing to evaluate the genetic stability of a live-attenuated RSV vaccine candidate and observed a previously unknown adaptation mechanism that was missed by short-read sequencing.
Codon-pair deoptimization involves recoding of open reading frames (ORFs) to reduce protein expression and is used as a mechanism for creating live-attenuated vaccine candidates. It is important, however, to understand whether deoptimized viruses could accumulate mutations under selective pressure that might lead to de-attenuation.
To this end, researchers of the Laboratory of Infectious Diseases at the National Institute of Allergy and Infectious Diseases used a combination of sequencing technologies to examine codon-pair deoptimization in human RSV under selective pressure.
In their PNAS paper “Rescue of codon-pair deoptimized respiratory syncytial virus by the emergence of genomes with very large internal deletions that complemented replication,” Cyril Le Nouën et al. tested the genetic stability of a live-RSV vaccine candidate that had been attenuated by codon-pair deoptimization of its glycoprotein genes. As one of the hallmarks of attenuation, the replication of this RSV strain was reduced at higher temperatures. This feature was used to test stability of the attenuation: the virus was passaged in cell culture over several months at continuously increasing temperatures. After each passage, the entire virus genome (about 15 kb) was sequenced to identify mutations.
The researchers discovered that, while the RSV strain accumulated point mutations, these had minimal effect on viral replication. Unexpectedly, however, using a combination of long-range PCR and PacBio HiFi sequencing, the scientists identified genomes with large deletions that appeared early in the serial passage and became the dominant species. These large-deletion genomes rescued RSV glycoprotein expression, thereby restoring replication of the deoptimized virus. The researchers hypothesized that these large-deletion genomes arose through polymerase jumping.
“Under selective pressure, Large Deletion (LD) genomes were selected to restore rather than to inhibit the replication of a single-stranded RNA virus, attenuated by [codon-pair deoptimization] of two ORFs,” the authors report. Such a mechanism of compensation was previously unknown for RNA viruses and suggests that the accumulation of DI (defective interfering) genomes has to be carefully investigated during the generation and evaluation of live-attenuated vaccine candidates.
With growing interest in viral sequencing and vaccine development, understanding how viruses adapt under selective pressure is essential. PacBio’s long-read sequencing played an important role in this study by identifying large deletions that would have otherwise been missed had only short-read methods been used. As first author Le Nouën notes, “Long-range deep sequencing is a useful method to understand the virus population dynamics and specifically how mutations co-evolve over time on a viral genome.”
Today we’re pleased to announce the three winners of our latest SMRT Grant, which called for teams of researchers and collaborative projects that could be addressed using the power of HiFi sequencing. The winners are seeking to answer a diverse set of questions, from mussel-hopping transmissible cancer, to the power of pistachios to help tackle climate change, to sex determination in bearded dragons.
The 2020 HiFi for All – Collaborations SMRT Grant Program was open to scientists worldwide and offered three winning projects awards of up to 10 SMRT Cells 8M and sequencing on the Sequel II or IIe System by one of our service providers and co-sponsors. We received many truly compelling proposals, featuring teams from across the life sciences and beyond, and selecting three winners was quite a challenge. Here is a glimpse into how the winning teams will use HiFi sequencing to advance their science.
Cancer or Infectious Disease? Unblurring the Line in Bivalves
Cancers may evolve as they mutate and divide, but they almost always lead to an evolutionary dead end with the death or remission of their hosts. However, in a handful of cases, cancer cells have been shown to spread beyond their original hosts. In these transmissible cancers, cells themselves jump from individual to individual, spreading through the environment.
An inter-continental team of biologists and geneticists, led by Michael Metzger, will investigate this type of cancer in marine mussels, where molecular analysis has shown that a bivalve transmissible neoplasia (BTN) that arose from a single Mytilus trossulus individual has now been found infecting four different Mytilus species around the world.
Among the collaborators, Nicolas Bierne (CNRS – University of Montpellier, France) will provide samples from M. edulis in Europe; Petr Strelkov and Maria Skazina (St. Petersburg State University, Russia) and Nelly Odintsova (Far Eastern Branch of the Russian Academy of Sciences) will provide M. trossulus samples from their studies in the Sea of Japan; Artur Burzynski (IO PAN, Poland) will aid in analysis of a sample from Northern Europe; and Gloria Arriagada (Universidad Andrés Bello, Chile) will provide samples of M. chilensis from South America.
Together, this team has assembled the most widespread marine transmissible cancer lineage known and aims to use HiFi sequencing to detect and phase somatic variants to understand the selective pressures underlying this fatal and unexplained phenomenon.
“We are really excited about what the HiFi data and this collaboration will allow us to see. This single lineage of transmissible cancer that began in a single animal has spread into populations of marine mussels around the world, and by working together, we will be able to untangle the genetic changes that have shaped its evolution.”
– Michael Metzger, Pacific Northwest Research Institute
Sequencing for this project will be provided by the PacBio Certified Service Provider University of Louisville’s Department of Biochemistry and Molecular Genetics.
Tackling a Challenging Genome That Could Help Address Climate Change
Winner: Esaú Martínez, CIAG-IRIAF, Spain
As a highly nutritious crop adapted to arid conditions, pistachio has become popular for crop replacement in regions affected by climate change. However, even this resilient species is starting to suffer the effects of warmer winters, which cause a lack of chilling accumulation that affects bud dormancy breaking, bud burst and flowering.
In order to make the most of the crop’s tolerance to drought and warm climates, a team of scientists from across Europe and the United States is working to find resilience mechanisms by sequencing the large genetic variation of cultivated pistachio.
Led by Martínez, the team will create a pangenome collection of six highly heterozygous cultivars, with extensive haplotype diversity that can be exploited for breeding. The team hopes the pangenomes will serve as a critical starting point toward the long-term goal of using HiFi sequencing to characterize genomic diversity in global pistachio collections. Additionally, the team will generate a pantranscriptome using the PacBio full-length RNA sequencing (Iso-Seq) method to precisely annotate the pangenome and identify transcript variants related to climate adaptation.
Collaborators include Adela Mena (IVICAM-IRIAF, Spain); Antonio Giovino and Luigi Cattivelli (@luigicattivelli, Council for Agricultural Research and Economics, Italy); Annalisa Marchese and Francesco Paolo Marra (University of Palermo, Italy); and Pablo Carbonell (@pcarbonellb, Max Planck Institute for Developmental Biology, Germany). Grey Monroe (@grey_monroe, University of California-Davis) will also take part to identify functional variants contributing to production under warmer climates.
“We are extremely delighted for the opportunity to work together with PacBio in this project. Pistachio is a highly nutritious crop adapted to arid conditions. However, pistachio farmers are already starting to experience the negative effects of climate change. We believe HiFi is the perfect technology for the development of functional genomic resources for pistachio breeding. Combining pangenome and pantranscriptome approaches we will identify functional variants enabling the sustainability of the crop under warmer conditions.”
– Esaú Martínez, CIAG-IRIAF
Sequencing for this project will be provided by the PacBio Certified Service Provider GENTYANE, part of the Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE).
Unraveling the Mechanisms of Sex Determination in Reptiles
In vertebrates, sex is determined either by genetic factors on sex chromosomes (genetic sex determination; GSD) or by environmental cues, such as egg incubation temperature (temperature-dependent sex determination; TSD).
In reptiles, both modes are common, but the mechanisms underpinning reptile GSD and TSD remain mysterious. High repeat content and close homology between the sex chromosomes in reptiles have thwarted previous attempts at their assembly and phasing to identify GSD candidates, and the gene/s controlling TSD may act at any level in the complex regulatory cascade governing sexual differentiation, potentially implicating a multitude of genes across all chromosomes.
A team of researchers from across Australia, led by Deveson, will attempt to unravel the mystery by sequencing the genome and transcriptome of the bearded dragon lizard, Pogona vitticeps, a unique model organism in which sex is shaped by both genotype and temperature. Dragons have a GSD system wherein chrZW embryos develop as female and chrZZ embryos as male. However, at high egg-incubation temperatures, this is overridden so that both chrZZ and chrZW embryos develop as female.
Additional RNA sequencing on adult tissues and embryonic developmental stages will allow the team to generate high-quality, allele-specific transcriptome annotations and uncover the primary transcriptional signatures of both GSD and TSD.
Collaborators include Arthur Georges and Sarah Whiteley (University of Canberra); Hardip Patel and Yu Lin (Australian National University); Parwinder Kaur (UWA, Australia); and Andre Reis (Garvan Institute).
“The daunting task of complete assembly and phasing of reptilian ZW sex chromosomes requires long reads with very high per-base accuracy. I believe that PacBio HiFi sequencing is the only technology that can deliver this.”
– Ira Deveson, Garvan Institute
Sequencing for this project will be provided by Nucleome Informatics.
Congratulations to all our HiFi for All – Collaborations SMRT Grant winners! And thank you to our co-sponsors for teaming up with PacBio to make these SMRT Grants possible. Explore the 2021 SMRT Grant Programs to apply to have your project funded and learn more about HiFi sequencing.
It’s been a year since we took a little field trip to Stanford to collect samples from the giant California redwood (Sequoia sempervirens) with the goal of assembling its ginormous 27 Gb genome.
What would have been considered a herculean effort not that many years ago was accomplished in only a few weeks by a handful of personnel —Emily Hatas (@EmilyHatas), Greg Young (@PacbioGreg), Michelle Vierra (@the_mvierra), and Greg Concepcion (@phototrophic) — in their spare time.
As detailed in this blog post, the crew put together an assembly with 22-fold coverage in just 17 days: 4 days of sample prep, 7 days of sequencing, and 6 days for assembly. With a little more time on their hands, and enough HiFi library to go around, the team embarked on more sequencing to create an even better assembly, with 33-fold coverage and a contig N50 of 3.8 Mb.
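For readers new to the metric, contig N50 is the contig length at which contigs of that size or longer hold at least half of the assembled bases. Here is a minimal Python sketch of that calculation. The contig lengths are made up for illustration and are not taken from the redwood assembly, and the function name is our own.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bases)."""
    lengths = list(lengths)
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk the contigs from longest to shortest until the running
    # total covers at least half of all assembled bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths, for illustration only.
example_contigs = [80, 70, 50, 40, 30, 20]
print(n50(example_contigs))
```

A higher N50 means more of the assembly sits in long, uninterrupted contigs, which is why the metric is often quoted alongside coverage.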
Wanting to take things even further, Iso-Seq analysis expert Elizabeth Tseng (@magdoll) delved deep into sequences of transcripts from the redwood’s needles.
The results from two Sequel II SMRT Cells (a total of 5.3 million full-length reads) were mapped to the hifiasm v12 assembly of the PacBio redwood genome, yielding 336,853 high-quality Iso-Seq transcripts, with 69,198 mapped loci and 205,792 unique, full-length transcripts.
The mapped transcripts ranged from 50 bp to 14.2 kb with a mean length of 2.9 kb. While most of the loci had 1–5 isoforms, there were many that displayed complex alternative splicing patterns, highlighting the power of full-length transcript sequencing.
“I found several aspects of the Iso-Seq data exciting,” Tseng noted in a Medium post about the work. “One was the ability to see alternative splicing. Another was the ability to predict ORFs directly from the sequences.”
The exercise was also a good test of the IsoPhase isoform phasing method that Tseng initially developed for maize, a diploid genome. Would it work for a hexaploid genome?
By combining phased genome information with phased transcriptome data, Tseng was able to identify five distinct alleles, as well as genes that were likely to be homologous.
Lastly, Tseng used the Iso-Seq data — and another tool, Cogent — to assess the quality of the redwood genome assembly.
“The high mappability of the Iso-Seq data to the PacBio genome has shown that the genome assembly is quite complete in terms of coding regions,” she said. “Missing genes or difficult-to-assemble gene regions can be assessed using Iso-Seq transcripts.”
Want To See More Redwood Iso-Seq Analysis? Dig In! We’ve released the Iso-Seq dataset, including the transcript sequences, GFF files, BLASTN hits, and IsoPhase and Cogent results. We welcome the community to use this dataset for research and tool development, and to give us feedback.
Data for the redwood genome has also been made publicly available along with the updated genome assembly and can be found here.
As new strains of the SARS-CoV-2 virus emerge, including variants that appear to make the virus more contagious or potentially more likely to affect the efficacy of the new vaccines, it’s clear that continued genomic surveillance will be essential as we try to rein in the COVID-19 pandemic.
Fortunately, scientists have been using all the tools at their disposal, including SMRT Sequencing platforms from PacBio. In a talk presented at a virtual American Society for Microbiology event in December 2020, Labcorp’s Michael Levandoski spoke about using the Sequel II System for mapping COVID-19 outbreaks by location and over time.
Watch Levandoski’s full presentation:
As a reference lab taking samples from all over the U.S., Labcorp has unique insight into the genetic path of the virus in the world’s largest outbreak. “The samples we collect represent circulating virus in the population,” Levandoski noted, adding that the company has tested more than 3.4 million positive samples as of February 2021. For each sample collected, metadata about timing, location, and patient demographics are recorded to create a truly valuable data set. The company’s large-scale SARS-CoV-2 sequencing project was built on a Sequel II System workflow, analyzing remnants of samples that have tested positive with diagnostic assays.
Of course, that means Labcorp scientists are working with low-input samples, for which they use two pools of overlapping 1.2 kb amplicons to sequence the whole viral genome at a pace of 600 to 1,000 genomes per SMRT Cell using HiFi reads. Levandoski’s team has sequenced more than 17,000 genomes as of February 26, 2021, and reports that HiFi assemblies offer very high resolution without missing any regions, so scientists can identify new mutations with confidence. “We’re able to detect new mutations and variants in the population as they appear,” he said. The team has nearly 20,000 archived samples collected from before March 15th that are in the pipeline for sequencing, and new samples will continue to be sequenced for pathogen surveillance purposes.
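To picture how two pools of overlapping amplicons can cover a whole viral genome, here is a rough Python sketch. The 1.2 kb amplicon size comes from the talk, and 29,903 bp is the length of the SARS-CoV-2 reference genome. The 100 bp overlap, the coordinates, and the function name are illustrative assumptions on our part, not Labcorp's actual primer design.

```python
def tile_amplicons(genome_len, amplicon_len=1200, overlap=100):
    """Tile a genome with overlapping amplicons split across two pools.

    Returns (start, end, pool) tuples with 0-based, end-exclusive
    coordinates. Neighbouring amplicons alternate between pool 1 and
    pool 2, so overlapping amplicons never share a reaction.
    """
    if not 0 <= overlap < amplicon_len:
        raise ValueError("overlap must be >= 0 and shorter than the amplicon")
    step = amplicon_len - overlap
    tiles = []
    start = 0
    while True:
        pool = 1 + len(tiles) % 2
        if start + amplicon_len >= genome_len:
            # Shift the final amplicon back so it ends at the genome end.
            start = max(0, genome_len - amplicon_len)
            tiles.append((start, genome_len, pool))
            return tiles
        tiles.append((start, start + amplicon_len, pool))
        start += step


# SARS-CoV-2 reference length; the overlap is an assumed value.
for start, end, pool in tile_amplicons(29903)[:4]:
    print(f"pool {pool}: {start}-{end}")
```

Splitting neighbours into separate pools is a common choice for tiled amplicon schemes, because primers from overlapping amplicons would otherwise interfere with each other in the same reaction.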
With such a small genome, though, is long-read sequencing really necessary? Levandoski thinks so: he told conference attendees that short-read amplicon sequencing could miss new mutations that are key to monitoring transmission and the outcome of vaccinations. By ensuring that all mutations are detected with highly accurate HiFi reads, his team can check each new genetic change to determine whether it’s in a clinically relevant region of the genome.
As a result of this ongoing surveillance work, Labcorp was awarded a sequencing surveillance contract with the US Centers for Disease Control as a part of their efforts to track and learn more about SARS-CoV-2 as it evolves and spreads throughout the country.
Download the Labcorp protocol and learn more about the benefits of using PacBio sequencing for SARS-CoV-2 surveillance.
Sunday is Rare Disease Day – a time to honor the patients, families, caregivers, and healthcare professionals who are part of the rare disease community.
At PacBio, we are passionate about supporting this community and providing tools that help improve the ability of scientists and clinicians to deliver valuable answers to families and reduce what can be a years-long diagnostic odyssey. And while each ‘rare’ disease may affect a limited number of people, collectively these diseases affect hundreds of millions of people around the world.
Since we last celebrated this special day, we’ve been particularly excited by the progress made by cutting-edge scientists and clinicians who are applying new technologies to find the genetic root causes of these diseases. Leading into Rare Disease Day, we’d like to highlight and acknowledge the work of these scientists who are striving to improve the lives of those affected by rare diseases.
In Missouri, the team at Children’s Mercy Kansas City recently announced the opening of a massive new pediatric research facility housing the Children’s Mercy Research Institute (CMRI). The institute, established in 2015 to accelerate precise diagnoses and treatments for complex childhood diseases, is built on a translational approach that brings science and medicine together seamlessly.
One of the institute’s most important research projects is Genomic Answers for Kids (GA4K), a first-of-its-kind pediatric data repository that is collecting genomic data and health information from 30,000 children and their families during the next seven years to create a database of 100,000 genomes. More than 2,230 families with rare disease have enrolled in the program to date, resulting in more than 10,200 new genomic analyses and more than 250 genetic diagnoses, and already contributing to the reporting of 10 new disease genes.
GA4K focuses on rare diseases and has been solving previously unsolvable cases by implementing highly accurate long-read sequencing, known as HiFi sequencing. Based on early successes, the team has scaled up its capacity with additional Sequel IIe Systems and aims to use HiFi whole genome sequencing for approximately 1,000 cases that went unsolved after the preliminary short-read exome analysis.
Meanwhile, in Alabama, scientists at the HudsonAlpha Institute for Biotechnology recently announced that they found likely pathogenic variants in two pediatric rare disease cases that had remained unsolved using short-read sequencing. In both cases, the patients suffered from neurodevelopmental disorders. The scientists were able to pinpoint the disease-causing genetic variants through whole genome sequencing of parent-proband trios. One of the pathogenic variants was a 7 kb insertion in the CDKL5 gene, while in the other instance an extensive structural variation was highlighted. Both variant types are known to be challenging for short-read sequencing technologies and were therefore not discovered in the preliminary analysis.
“The ability to find so many variants that were previously missed is exciting, and holds great promise for diagnostic testing in the future,” says HudsonAlpha Faculty Investigator Greg Cooper, PhD. “Long-read genome sequencing will become a powerful tool for research and clinical testing over the next few years.”
One of the earliest examples of how PacBio sequencing technology could make a difference for rare disease cases came from the Stanford lab of Euan Ashley, a noted cardiologist who just released a new book, The Genome Odyssey: Medical Mysteries and the Incredible Quest to Solve Them. The book includes, among many others, a fascinating case of Carney complex in an individual who had suffered a series of tumors in his heart and glands, for whom eight years of genetic analyses had produced no firm answers.
These are just a few of the many great advancements among rare disease experts that are making new inroads into tough cases with HiFi sequencing. It is critical to remember that each of these explained cases represents a family that is now closer to the end of their diagnostic odyssey, potential treatment options, and renewed hope for healthier futures. We send our sincere gratitude to them and everyone working hard to accelerate the development of medical advancements in rare disease research.
If you’d like to participate in this wonderful community, take a look at these upcoming events in support of rare disease research awareness, funding and education:
- Rare Disease Day strives to raise awareness amongst the public and decision-makers about rare diseases and their impact on patients’ lives – to show your support, you can use your social channels to amplify and tag #RareDiseaseDay
- Collaborate with and support Children’s Mercy’s Genomic Answers for Kids program by nominating a patient for participation and/or donating to support their vision
- The HudsonAlpha team is hosting the Double Helix Dash in April, a virtual 5K to support childhood genetic disorders research – anyone can participate!
- PacBio is hosting a 3-day virtual event in April focused on the genetics of rare disease – register to attend and hear firsthand from scientists and clinicians on their recent discoveries
To learn more about how PacBio HiFi sequencing is helping advance our understanding of rare disease, visit our rare disease resource page.
What does the ideal genome assembly look like? High-quality, free of errors, with no gaps, and all haplotypes resolved.
It’s a big ask, especially with challenging genomes like plants that are rich in repetitive content with high levels of heterozygosity and complex polyploidy. Moreover, such assemblies often require a combination of technologies, such as sequencing plus optical mapping.
But a team of scientists at the King Abdullah University of Science and Technology (KAUST) Core Labs (@kaust_corelabs) proved it is possible by using a single technology, PacBio HiFi Sequencing, in just seven days.
Their recent preprint introduced LeafGo, a streamlined workflow able to produce a high-quality draft plant genome from plant tissue without using additional scaffolding technologies.
The rapid, one-pass approach was tested on two different Eucalyptus species, E. rudis and E. camaldulensis.
There are more than 800 eucalypt species, but only three genomes have been published: E. grandis, E. pauciflora and E. camaldulensis. The high-quality draft E. camaldulensis genome produced with LeafGo is an improvement upon those highly fragmented genomes, the KAUST team wrote.
Their assembly of E. rudis, a close relative of E. camaldulensis that inhabits a different ecological niche, is the first for that species.
“The two genomes sequenced here will improve our genomic knowledge of eucalypts, which at the moment is relatively sparse, and will assist with conservation issues and commercial uses,” they wrote.
The team tested both continuous long read (CLR) and HiFi circular consensus sequencing (CCS) data, and were especially impressed with the results from HiFi reads — “the higher base-level accuracy given by HiFi improves the assembly considerably, thus removing the need for polishing with short-read sequencing.”
“HiFi assemblies demanded less computational requirements, had higher BUSCO scores, showed several fold improvement of contig N50/N90 and L50/L90, and generated more complete genome assemblies,” the authors wrote.
“In fact, our HiFi sequencing data, assembled with hifiasm, produced near-chromosome level haploid draft genomes,” they added.
“One of the main advantages for our chosen genome assembly workflow, using hifiasm with HiFi reads, are the savings in time and compute requirements, all with minimal manual intervention.”
The estimated total time from raw reads to HiFi data to the assembly of a high-quality contiguous draft for a haploid genome of 0.6 to 1.0 Gb is approximately one day, they wrote. Assembling the HiFi data using hifiasm took 80 minutes for E. rudis (23x coverage) and 120 minutes for E. camaldulensis (27x coverage).
“When combined with time estimates of HMW DNA extraction (one day), HiFi library preparation and sequencing (five days) and assembly; a high-quality draft genome can be prepared from plant samples in seven days, depending on available compute resources,” the authors stated.
The team also created a modified Qiagen Genomic protocol in order to tackle the challenge of extracting high molecular weight DNA from the Eucalyptus species, which is difficult due to their high phenolic and polysaccharide content.
“Our extraction protocol generated high pure and copious amounts of HMW DNA within a day and using minimal resources and effort,” they wrote.
The authors say they hope LeafGo will be a valuable tool for global initiatives to sequence and assemble genomes for many thousands of eukaryotic life forms that do not yet have published standardized workflows.
Genome assembly statistics for two Eucalyptus species
This blog post has been updated; it was originally published in September 2016.
In recent interactions with the scientific community, we’ve seen a growing number of questions around scaffolding genome assemblies. We thought it might be useful to review the concepts behind contigs and scaffolds, as well as the circumstances in which one might want to scaffold a high-quality PacBio genome assembly.
Contigs vs. Scaffolds
Contigs are continuous stretches of sequence containing only A, C, G, or T bases without gaps. SMRT Sequencing has all of the necessary performance characteristics – long reads, lack of sequence-context bias, and high accuracy – to generate contiguous genome assemblies with megabase-sized contigs. Ultra-long contigs provide complete and uninterrupted sequence information across full genes, and more recently even allow separation of the different chromosomes for diploid and polyploid organisms.
The unprecedented quality of PacBio highly accurate long reads – known as HiFi reads – has been described as “the most effective standalone technology for de novo assembly” in a study focused on sequencing the CHM13 human cell line, which yielded an assembly contig N50 of 29.5 Mb and a Phred quality score of Q45. HiFi reads have also enabled generating reference-quality de novo assemblies of many plant and animal species, population-specific human assemblies and the first fully complete sequence of a human autosome – chromosome 8, including the centromeres. Even large and complex plant genomes like the California Redwood, a 27 Gb hexaploid, can be readily assembled with high contiguity using HiFi reads.
Learn how HiFi reads help scientists unlock new discoveries.
Scaffolds are created by chaining contigs together using additional information about the relative position and orientation of the contigs in the genome. Contigs in a scaffold are separated by gaps, which are designated by a variable number of ‘N’ letters. Scaffolding is often used for short-read assemblies to make sense of the fragmented genome assemblies containing short contigs. However, scaffolds have three principal deficiencies:
- Scaffolds miss critical information. Gaps represent missing genomic information and, in many cases, these gaps can coincide with important genomic loci. Many promoters and first exons are GC-rich in sequence, often resulting in missing or low-quality sequence reads from short-read or Sanger sequencing. Thus, genes are incompletely resolved, and their regulation cannot be understood. Another reason for gaps in scaffolded assemblies is large, repetitive elements, which short-read sequencing methods struggle to bridge. Thus, duplicated genes, genes vs. pseudogenes, short tandem repeats, variable number tandem repeats, microsatellites, and many other structural genomic features are often unresolved in scaffolded short-read assemblies. As summarized in a Nature Reviews Genetics article, long-read sequencing technologies, and specifically HiFi reads, help overcome these types of complex regions to give a complete picture of genetic variation, including in regions previously thought to be intractable like telomeres and centromeres.
- The length of a scaffold gap often has no relation to the true gap size. In several reference genomes, gaps are arbitrarily set to certain fixed lengths. For example, most gaps in the zebra finch reference are set to 100 Ns, while in the version 3 maize reference they are set to 1,000 Ns. This means that in most cases, the true length of sequence represented by the gap differs from the set gap size, and is sometimes off by thousands of bases. These uncertain gap sizes make it impossible to understand the true spatial relationships of functional elements in genomes, and the placeholder lengths often underestimate the actual extent of missing information. More recently, those older reference assemblies have benefited from PacBio long-read sequencing – see the latest: zebra finch and maize.
- Gap-flanking scaffold sequence can be low-quality, and is sometimes completely wrong. The sequences surrounding gaps often fall into areas where short-read technologies have deficiencies due to GC-bias or read-length limitations. This can result in sequence that is of lower quality and, in some cases, completely erroneous. For example, because of complex repeat structures in the human IGH locus, the right edge of a 50,000 N gap in the short-read assembly contains 1,836 bases of flanking sequence that has no support in the hg19 human genome reference or the PacBio assembly. In some ways, having incorrect flanking sequence in scaffolds is worse than having ‘N’ gaps, since that erroneous sequence is considered and included for downstream analyses.
Illustration of the difference between contigs and scaffolds in genome assemblies
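The contig and scaffold relationship described above can also be shown in a few lines of Python. This is a minimal sketch on a toy sequence we made up, not a real assembly, and the function name is hypothetical. It splits a scaffold at runs of ‘N’ to recover the contigs, then tallies the gap lengths.

```python
import re
from collections import Counter

# One or more consecutive N (or n) characters marks a scaffold gap.
GAP = re.compile(r"[Nn]+")


def split_scaffold(scaffold):
    """Return (contigs, gap_lengths) for a scaffold sequence string."""
    # Contigs are the gap-free stretches between runs of N.
    contigs = [piece for piece in GAP.split(scaffold) if piece]
    # Every run of N counts as a gap, including leading or trailing runs.
    gap_lengths = [len(match.group()) for match in GAP.finditer(scaffold)]
    return contigs, gap_lengths


# Toy scaffold (hypothetical): three contigs joined by two 10-N gaps.
toy = "ACGTACGT" + "N" * 10 + "GGCCTTAA" + "N" * 10 + "TTGACA"
contigs, gaps = split_scaffold(toy)
print([len(c) for c in contigs])
print(Counter(gaps))
```

When nearly every gap in an assembly has the same length, as with the fixed 100 N and 1,000 N gaps mentioned above, that is a hint the gap sizes are placeholders rather than measurements.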
The information missed by gapped scaffold assemblies complicates and may preclude downstream analysis and understanding related to functional and comparative genomics. Scaffolded short-read assemblies get nowhere near the quality of PacBio genome assemblies in terms of contiguity and completeness, and they often require labor-intensive follow-up work to close gaps, adding time and cost to projects.
Scaffolding PacBio assemblies for chromosome-scale genome representations
For even longer-range genomic connectivity, e.g. to bridge the largest segmental duplications and repeat regions, researchers can go a step further by adding scaffolding information to a PacBio assembly, often resulting in telomere-to-telomere, chromosome-scale genome representations. Several methods have been demonstrated to work very well for this purpose, including optical mapping and crosslinking approaches. Check out examples of barn swallow, insects, and human genome sequencing to see how chromosome-level scaffolding enables more comprehensive insights.
There are numerous large international initiatives using PacBio long-read sequencing to produce high-quality, phased, chromosome-level genome assemblies of many organisms:
- Vertebrate Genomes Project
- Sanger 25 Genomes Project
- Darwin Tree of Life Project
- NHGRI Human Pangenome Reference Initiative
- PacBio Workshop: Understanding the biology of genomes with HiFi sequencing
- Webinar: Sequencing 101 – How long-read sequencing improves access to genetic information
- Understanding Accuracy in DNA Sequencing
- Looking Beyond the Single Reference Genome to a Pangenome for Every Species
- The Evolution of DNA Sequencing Tools
They may not be as well known as our chimpanzee or gorilla cousins, but macaques have played many key roles in scientific progress over the last half century. From launching into orbit during the early days of space travel to revealing the genetics of neurodevelopmental disorders and infectious diseases today, the rhesus macaque remains a key research primate around the world.
A new, comprehensively annotated reference genome unveiled last month boosts the potential of the most widely used non-human primate in biomedical research even further, with new insight into gene functions and disease susceptibility.
A large team of researchers — led by Wesley C. Warren of the University of Missouri; Evan E. Eichler of the University of Washington; and Jeffrey Rogers of Baylor College of Medicine — has released an updated rhesus macaque (Macaca mulatta) reference genome that increases the sequence contiguity 120-fold over previous assemblies.
They also used PacBio full-length isoform sequencing, the Iso-Seq method, to analyze 6.5 million full-length transcripts and create a comprehensive set of protein-coding and non-coding gene models. This provided vital information about gene content, organization and isoform diversity, and led to the identification of new macaque isoforms and gene candidates.
“With the improved assembly of segmental duplications, we discovered new lineage-specific genes and expanded gene families that are potentially informative in studies of evolution and disease susceptibility,” the authors wrote.
In addition to the new reference, the team gathered whole genome sequencing data from 850 rhesus macaques from U.S. research colonies and three wild-caught Chinese samples. As with the human 1000 Genomes Project, they wanted to catalog genetic variation within the species, and ended up identifying 85.7 million single-nucleotide variants (SNVs) and 10.5 million indels — the most extensive collection of segregating genetic variants for any non-human primate species.
The researchers found that an average research macaque carried 9.7 million SNVs — more than twice as diverse per individual as humans — and they now plan on using the information to better understand aspects of genome function, such as gene regulation, and to generate new genetic models of disease.
By studying what these nucleotide changes might do to genes implicated in autism spectrum disorders and predicting likely gene disruption (LGD) variants, for example, the macaques may offer new clues to the often heterogeneous genetic condition.
These naturally occurring mutations could provide an opportunity to develop noninvasive models of human disease without the expense of CRISPR engineering of embryos, they added, which would be particularly useful in relation to phenotypes that are not readily reproduced in non-primate knockout models and for evaluating the effect of genetic variation on the efficacy of treatments before human trials.
“This new macaque reference genome and the genetic characterization of research populations will substantially advance biomedical research and studies of primate genome evolution by providing an improved framework for more complete studies of genetic variation and its phenotypic consequence,” the authors concluded.
Scientists have long struggled to explain the success of Mycobacterium tuberculosis in the face of effective therapeutics. Tuberculosis (TB) kills more than 1 million people annually, and has a remarkable ability to develop resistance to drugs despite its stable genome. But now, a new study from researchers at San Diego State University and other institutions strongly suggests that methylation rather than genome sequence gives M. tuberculosis its broad phenotypic range.
Lead author Samuel Modlin (@sam_modlin), senior author Faramarz Valafar (@FaramarzValafar), and collaborators report using SMRT Sequencing technology to characterize the DNA adenine methylomes of 93 clinical TB isolates. They chose samples representing diverse phylogenetic and geographic sources, and focused on methylation because of previous, smaller studies suggesting the importance of methyltransferases in M. tuberculosis. They aimed to delve deeper than those studies to see if whole methylome data could answer lingering questions about the pathogen.
“It is unclear how such a genetically static organism adapts so rapidly to drug treatment and varied immune pressures,” the scientists note. “DNA methylation is a plausible yet scarcely explored alternative mechanism for phenotypic variation in M. tuberculosis.”
In addition to producing highly accurate genetic data, SMRT Sequencing also measures epigenetic activity through kinetic changes as the DNA molecule is sequenced. The use of PacBio’s long-read technology proved critical: the long-read data enabled de novo assembly, unlike short reads that must be used with a reference-based variant calling approach. The new assemblies of the 93 isolates revealed an insertion, missed by previous studies, that is associated with an inactive methyltransferase.
The team deployed several techniques to analyze the samples comprehensively. They produced complete, de novo genome assemblies for all isolates with SMRT Sequencing — identifying all mutations in the three methyltransferases present — and also used the kinetic data to assess methyltransferase motif sites. Phylogenetic analysis allowed them to identify epigenomic diversity across seven lineages. Finally, the team used existing transcriptomic data sets to layer onto the methylome information for a deeper analysis.
Perhaps most interestingly, the scientists used an analysis pipeline that examines SMRT Sequencing kinetic data from each individual read, rather than in bulk. The results indicate a phenomenon the team refers to as intercellular mosaic methylation, or IMM, in which methylation is not strictly turned on or off but rather affects a subset of motif sites that vary from one cell to another.
“Mutation-driven IMM was nearly ubiquitous in the globally prominent Beijing sublineage,” they report. They also identified more than 350 hypervariable sites across the isolates where there appeared to be no consistency in methylation patterns. All told, they add, the results represent “the largest survey of methylomic diversity in [TB pathogens] to date.”
“This multi-omic integration revealed features of methylomic variability in clinical isolates and provides a rational basis for hypothesizing the functions of DNA adenine methylation in [M. tuberculosis] physiology and adaptive evolution,” the authors conclude.
“These findings add to the growing body of literature demonstrating bacterial epigenomics is an important complementary focus to genetic and phenotypic analysis in studying microbial diversity, gene regulation, and evolution.”
Learn more about methylation detection using PacBio sequencing for your research.
The power of PacBio HiFi reads has enabled transformative research into human disease. A new collaboration with Invitae, a leader in medical genetics, is intended to help harness the technology for use in mainstream medicine.
The ability of HiFi reads to detect genetic variants, even in hard-to-sequence regions of the genome, has already shown clinical utility. In a recent research collaboration with Invitae, announced in October 2020, the comprehensive, highly accurate reads were used to explore clinically relevant molecular targets for use in the development of advanced diagnostic testing for epilepsy.
We are thrilled to announce a new collaboration with Invitae to develop an ultra-high-throughput clinical whole genome sequencing platform. Read more about it here.
Learn more about the benefits and workflows of PacBio whole genome sequencing