
When Complete Isn’t Complete: C. elegans Genome Gets a Makeover

Wednesday, October 2, 2019

Cover artwork by Daisy S. Lim

It was the first multicellular eukaryotic genome sequenced to apparent completion, but it turns out the Caenorhabditis elegans reference that’s been used as a resource for the past 20 years does not exactly correspond with any N2 strain that exists today. 

Because it was assembled using sequence data from N2 and CB1392 populations of uncertain lineage, grown in at least two different laboratories during the 1980s and 1990s, the C. elegans reference genome is limited in accuracy both by genetic variants and by the technology of the time (clone-based Sanger sequencing). The strain is believed to have accumulated up to 1,000 neutral mutations even before it was first frozen in 1969, and substantial genetic differences have arisen between strains in different laboratories since then.

So a team of researchers from Stanford, Cornell, and the University of Tokyo sought to recomplete the genome by performing long-read assembly of VC2010, a modern and easily available nonmutagenized derivative of N2. Not satisfied with the completeness of earlier assembly attempts, the team decided to use three sequencing technologies: Illumina short reads, as well as PacBio and Nanopore long reads.

As described in their cover-gracing Genome Research study, their VC2010 assembly has 99.98% identity to N2 but contains an additional 1.8 Mb, including tandem repeat expansions, genome duplications, and at least 53 newly predicted genes. Of 116 structural discrepancies between N2 and VC2010, 97 (84%) had structures matching VC2010 that were also found in two outgroup strains, implying deficiencies in N2.
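The 84% figure follows directly from the two counts quoted above. A quick check in Python, using only those numbers from the study (the variable names are illustrative, not taken from the paper):

```python
# Counts quoted from the study: structural discrepancies between N2 and VC2010,
# and how many of them show the VC2010 structure in the two outgroup strains.
total_discrepancies = 116
matching_vc2010 = 97

# Fraction of discrepancies where the outgroups agree with VC2010 rather than N2.
share = matching_vc2010 / total_discrepancies
print(f"{share:.1%} of discrepancies support VC2010")
```

The unrounded value is about 83.6%, which the study reports as 84%.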

“Although we do not expect this or any assembly to be perfect, the VC2010 assembly provides substantial advantages over its predecessors in both precision and completeness,” the authors wrote.

The team assembled raw PacBio reads with the long-read genome assemblers Canu, FALCON, miniasm, and HINGE. Although all four worked from the same input sequencing data, each left gaps in different places, so the assemblies complemented one another. Merging these PacBio assemblies produced an assembly with only five gaps across the genome. Using Nanopore long reads, the team closed three of those, bringing the total down to two.

The improved assembly “yielded features not visible in the N2 reference assembly, and has several technical and biological implications,” they added. 

They also suggested that more of the nematode genetic record may need to be corrected. They noted that almost 2% of the putatively gap-free C. elegans genome proved to be missing from the N2 assembly, including long stretches of repetitive DNA, and said “it seems likely that most of the nematode assemblies generated over the last decade are missing some repetitive regions of genomic DNA.”

Such highly tandemly repeated regions may be crucial for understanding fast-evolving gene families relevant to nematode ecology, and for identifying rapidly evolving virulence factors in parasites such as N. brasiliensis, they added.

“With the possible exception of highly reduced genomes such as Pratylenchus coffeae, long-read assembly will probably be needed to detect and resolve these systematically lost genome sequences,” they wrote.

To ensure reproducibility of their new assembly in vivo, the team derived a highly clonal strain from VC2010, called PD1074, and used it to generate most of the genomic sequence data.

“C. elegans researchers who wish to have significantly higher genomic and genetic reproducibility than is possible with N2 are encouraged to adopt PD1074 as a new reference strain for wild-type controls, classical mutagenesis, and genome engineering,” the authors wrote.

Reining in the Wild Strains

Structural variations between the CB4856 and N2 genomes and their effects on chromosomal contents

In a second Genome Research paper, researchers from Seoul National University generated a de novo assembly of CB4856, which is one of the most genetically divergent strains of C. elegans compared to the N2 reference strain.

Their study sought to determine how substantial genomic changes are generated and tolerated within a species, and to compare the wild strain with the N2 reference, as the two have numerous heritable phenotypic differences, including aggregation behavior, mating, nictation behavior, pathogen response and genetic incompatibility. 

The team was not satisfied that the current CB4856 reference genome, which was generated from short reads, accurately represents genomic rearrangements longer than the insert length, and was concerned that it might be missing insertions and repetitive sequences. They therefore generated their own PacBio genome assembly to the level of pseudochromosomes containing 76 contigs.

They identified structural variations that affected as many as 2,694 genes, and found that subtelomeric regions contained the most extensive genomic rearrangements, even creating new subtelomeres in some cases. 

The high variability of subtelomeres over generations facilitates the emergence of new genes and may help to increase the fitness of organisms, the authors noted. However, subtelomeres (hypervariable regions adjacent to the telomere) are highly repetitive by nature, which makes genome assembly at these sites very difficult and has hampered the study of their involvement in chromosome evolution.

The subtelomere structure that the Korean team was able to unravel with PacBio sequencing implies that ancestral telomere damage was repaired by alternative lengthening of telomeres, even in the presence of a functional telomerase gene, and that a new subtelomere was formed by break-induced replication, the authors said. 

“Our study demonstrates that substantial genomic changes including structural variations and new subtelomeres can be tolerated within a species, and that these changes may accumulate genetic diversity within a species,” they wrote. 

The researchers said they hoped their CB4856 genome would serve as a better reference genome for wild C. elegans strains, and that the numerous structural variations (SVs) between N2 and CB4856 would help clarify the effect of SVs on traits through association studies using these strains.
