Menu
July 19, 2019

CGGBP1 mitigates cytosine methylation at repetitive DNA sequences.

CGGBP1 is a repetitive DNA-binding transcription regulator with target sites at CpG-rich sequences such as CGG repeats and Alu-SINEs and L1-LINEs. The role of CGGBP1 as a possible mediator of CpG methylation however remains unknown. At CpG-rich sequences cytosine methylation is a major mechanism of transcriptional repression. Concordantly, gene-rich regions typically carry lower levels of CpG methylation than the repetitive elements. It is well known that at interspersed repeats Alu-SINEs and L1-LINEs high levels of CpG methylation constitute a transcriptional silencing and retrotransposon inactivating mechanism.Here, we have studied genome-wide CpG methylation with or without CGGBP1-depletion. By high throughput sequencing of bisulfite-treated genomic DNA we have identified CGGBP1 to be a negative regulator of CpG methylation at repetitive DNA sequences. In addition, we have studied CpG methylation alterations on Alu and L1 retrotransposons in CGGBP1-depleted cells using a novel bisulfite-treatment and high throughput sequencing approach.The results clearly show that CGGBP1 is a possible bidirectional regulator of CpG methylation at Alus, and acts as a repressor of methylation at L1 retrotransposons.


July 19, 2019

HLA typing for the next generation.

Allele-level resolution data at primary HLA typing is the ideal for most histocompatibility testing laboratories. Many high-throughput molecular HLA typing approaches are unable to determine the phase of observed DNA sequence polymorphisms, leading to ambiguous results. The use of higher resolution methods is often restricted due to cost and time limitations. Here we report on the feasibility of using Pacific Biosciences’ Single Molecule Real-Time (SMRT) DNA sequencing technology for high-resolution and high-throughput HLA typing. Seven DNA samples were typed for HLA-A, -B and -C. The results showed that SMRT DNA sequencing technology was able to generate sequences that spanned entire HLA Class I genes that allowed for accurate allele calling. Eight novel genomic HLA class I sequences were identified, four were novel alleles, three were confirmed as genomic sequence extensions and one corrected an existing genomic reference sequence. This method has the potential to revolutionize the field of HLA typing. The clinical impact of achieving this level of resolution HLA typing data is likely to considerable, particularly in applications such as organ and blood stem cell transplantation where matching donors and recipients for their HLA is of utmost importance.


July 19, 2019

Population structure of mitochondrial genomes in Saccharomyces cerevisiae.

Rigorous study of mitochondrial functions and cell biology in the budding yeast, Saccharomyces cerevisiae has advanced our understanding of mitochondrial genetics. This yeast is now a powerful model for population genetics, owing to large genetic diversity and highly structured populations among wild isolates. Comparative mitochondrial genomic analyses between yeast species have revealed broad evolutionary changes in genome organization and architecture. A fine-scale view of recent evolutionary changes within S. cerevisiae has not been possible due to low numbers of complete mitochondrial sequences.To address challenges of sequencing AT-rich and repetitive mitochondrial DNAs (mtDNAs), we sequenced two divergent S. cerevisiae mtDNAs using a single-molecule sequencing platform (PacBio RS). Using de novo assemblies, we generated highly accurate complete mtDNA sequences. These mtDNA sequences were compared with 98 additional mtDNA sequences gathered from various published collections. Phylogenies based on mitochondrial coding sequences and intron profiles revealed that intraspecific diversity in mitochondrial genomes generally recapitulated the population structure of nuclear genomes. Analysis of intergenic sequence indicated a recent expansion of mobile elements in certain populations. Additionally, our analyses revealed that certain populations lacked introns previously believed conserved throughout the species, as well as the presence of introns never before reported in S. cerevisiae.Our results revealed that the extensive variation in S. cerevisiae mtDNAs is often population specific, thus offering a window into the recent evolutionary processes shaping these genomes. In addition, we offer an effective strategy for sequencing these challenging AT-rich mitochondrial genomes for small scale projects.


July 19, 2019

Multiplexed highly-accurate DNA sequencing of closely-related HIV-1 variants using continuous long reads from single molecule, real-time sequencing.

Single Molecule, Real-Time (SMRT(®)) Sequencing (Pacific Biosciences, Menlo Park, CA, USA) provides the longest continuous DNA sequencing reads currently available. However, the relatively high error rate in the raw read data requires novel analysis methods to deconvolute sequences derived from complex samples. Here, we present a workflow of novel computer algorithms able to reconstruct viral variant genomes present in mixtures with an accuracy of >QV50. This approach relies exclusively on Continuous Long Reads (CLR), which are the raw reads generated during SMRT Sequencing. We successfully implement this workflow for simultaneous sequencing of mixtures containing up to forty different >9 kb HIV-1 full genomes. This was achieved using a single SMRT Cell for each mixture and desktop computing power. This novel approach opens the possibility of solving complex sequencing tasks that currently lack a solution. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.


July 19, 2019

Single-Molecule Real-Time Sequencing combined with optical mapping yields completely finished fungal genome.

Next-generation sequencing (NGS) technologies have increased the scalability, speed, and resolution of genomic sequencing and, thus, have revolutionized genomic studies. However, eukaryotic genome sequencing initiatives typically yield considerably fragmented genome assemblies. Here, we assessed various state-of-the-art sequencing and assembly strategies in order to produce a contiguous and complete eukaryotic genome assembly, focusing on the filamentous fungus Verticillium dahliae. Compared with Illumina-based assemblies of the V. dahliae genome, hybrid assemblies that also include PacBio-generated long reads establish superior contiguity. Intriguingly, provided that sufficient sequence depth is reached, assemblies solely based on PacBio reads outperform hybrid assemblies and even result in fully assembled chromosomes. Furthermore, the addition of optical map data allowed us to produce a gapless and complete V. dahliae genome assembly of the expected eight chromosomes from telomere to telomere. Consequently, we can now study genomic regions that were previously not assembled or poorly assembled, including regions that are populated by repetitive sequences, such as transposons, allowing us to fully appreciate an organism’s biological complexity. Our data show that a combination of PacBio-generated long reads and optical mapping can be used to generate complete and gapless assemblies of fungal genomes.Studying whole-genome sequences has become an important aspect of biological research. The advent of next-generation sequencing (NGS) technologies has nowadays brought genomic science within reach of most research laboratories, including those that study nonmodel organisms. However, most genome sequencing initiatives typically yield (highly) fragmented genome assemblies. Nevertheless, considerable relevant information related to genome structure and evolution is likely hidden in those nonassembled regions. Here, we investigated a diverse set of strategies to obtain gapless genome assemblies, using the genome of a typical ascomycete fungus as the template. Eventually, we were able to show that a combination of PacBio-generated long reads and optical mapping yields a gapless telomere-to-telomere genome assembly, allowing in-depth genome analyses to facilitate functional studies into an organism’s biology. Copyright © 2015 Faino et al.


July 19, 2019

Selections that isolate recombinant mitochondrial genomes in animals.

Homologous recombination is widespread and catalyzes evolution. Nonetheless, its existence in animal mitochondrial DNA is questioned. We designed selections for recombination between co-resident mitochondrial genomes in various heteroplasmic Drosophila lines. In four experimental settings, recombinant genomes became the sole or dominant genome in the progeny. Thus, selection uncovers occurrence of homologous recombination in Drosophila mtDNA and documents its functional benefit. Double-strand breaks enhanced recombination in the germ line and revealed somatic recombination. When the recombination partner was a diverged D. melanogaster genome or a genome from a different species such as D. yakuba, sequencing revealed long continuous stretches of exchange. In addition, the distribution of sequence polymorphisms in recombinants allowed us to map a selected trait to a particular region in the Drosophila mitochondrial genome. Thus, recombination can be harnessed to dissect function and evolution of mitochondrial genome.


July 19, 2019

Variable genetic architectures produce virtually identical molecules in bacterial symbionts of fungus-growing ants.

Small molecules produced by Actinobacteria have played a prominent role in both drug discovery and organic chemistry. As part of a larger study of the actinobacterial symbionts of fungus-growing ants, we discovered a small family of three previously unreported piperazic acid-containing cyclic depsipeptides, gerumycins A-C. The gerumycins are slightly smaller versions of dentigerumycin, a cyclic depsipeptide that selectively inhibits a common fungal pathogen, Escovopsis. We had previously identified this molecule from a Pseudonocardia associated with Apterostigma dentigerum, and now we report the molecule from an associate of the more highly derived ant Trachymyrmex cornetzi. The three previously unidentified compounds, gerumycins A-C, have essentially identical structures and were produced by two different symbiotic Pseudonocardia spp. from ants in the genus Apterostigma found in both Panama and Costa Rica. To understand the similarities and differences in the biosynthetic pathways that produced these closely related molecules, the genomes of the three producing Pseudonocardia were sequenced and the biosynthetic gene clusters identified. This analysis revealed that dramatically different biosynthetic architectures, including genomic islands, a plasmid, and the use of spatially separated genetic loci, can lead to molecules with virtually identical core structures. A plausible evolutionary model that unifies these disparate architectures is presented.


July 19, 2019

Highly sensitive, non-invasive detection of colorectal cancer mutations using single molecule, third generation sequencing.

Colorectal cancer (CRC) represents one of the most prevalent and lethal malignant neoplasms and every individual of age 50 and above should undergo regular CRC screening. Currently, the most effective preventive screening procedure to detect adenomatous polyps, the precursors to CRC, is colonoscopy. Since every colorectal cancer starts as a polyp, detecting all polyps and removing them is crucial. By exactly doing that, colonoscopy reduces CRC incidence by 80%, however it is an invasive procedure that might have unpleasant and, in rare occasions, dangerous side effects. Despite numerous efforts over the past two decades, a non-invasive screening method for the general population with detection rates for adenomas and CRC similar to that of colonoscopy has not yet been established. Recent advances in next generation sequencing technologies have yet to be successfully applied to this problem, because the detection of rare mutations has been hindered by the systematic biases due to sequencing context and the base calling quality of NGS. We present the first study that applies the high read accuracy and depth of single molecule, real time, circular consensus sequencing (SMRT-CCS) to the detection of mutations in stool DNA in order to provide a non-invasive, sensitive and accurate test for CRC. In stool DNA isolated from patients diagnosed with adenocarcinoma, we are able to detect mutations at frequencies below 0.5% with no false positives. This approach establishes a foundation for a non-invasive, highly sensitive assay to screen the population for CRC and the early stage adenomas that lead to CRC.


July 19, 2019

SMRT Sequencing for parallel analysis of multiple targets and accurate SNP phasing.

Single-molecule real-time (SMRT) sequencing generates much longer reads than other widely used next-generation (next-gen) sequencing methods, but its application to whole genome/exome analysis has been limited. Here, we describe the use of SMRT sequencing coupled with barcoding to simultaneously analyze one or a small number of genomic targets derived from multiple sources. In the budding yeast system, SMRT sequencing was used to analyze strand-exchange intermediates generated during mitotic recombination and to analyze genetic changes in a forward mutation assay. The general barcoding-SMRT approach was then extended to diffuse large B-cell lymphoma primary tumors and cell lines, where detected changes agreed with prior Illumina exome sequencing. A distinct advantage afforded by SMRT sequencing over other next-gen methods is that it immediately provides the linkage relationships between SNPs in the target segment sequenced. The strength of our approach for mutation/recombination studies (as well as linkage identification) derives from its inherent computational simplicity coupled with a lack of reliance on sophisticated statistical analyses. Copyright © 2015 Guo et al.


July 19, 2019

Single molecule real-time sequencing of Xanthomonas oryzae genomes reveals a dynamic structure and complex TAL (transcription activator-like) effector gene relationships.

Pathogen-injected, direct transcriptional activators of host genes, TAL (transcription activator-like) effectors play determinative roles in plant diseases caused by Xanthomonas spp. A large domain of nearly identical, 33-35 aa repeats in each protein mediates DNA recognition. This modularity makes TAL effectors customizable and thus important also in biotechnology. However, the repeats render TAL effector (tal) genes nearly impossible to assemble using next-generation, short reads. Here, we demonstrate that long-read, single molecule real-time (SMRT) sequencing solves this problem. Taking an ensemble approach to first generate local, tal gene contigs, we correctly assembled de novo the genomes of two strains of the rice pathogen X. oryzae completed previously using the Sanger method and even identified errors in those references. Sequencing two more strains revealed a dynamic genome structure and a striking plasticity in tal gene content. Our results pave the way for population-level studies to inform resistance breeding, improve biotechnology and probe TAL effector evolution.


July 19, 2019

Single-molecule sequencing of the desiccation-tolerant grass Oropetium thomaeum.

Plant genomes, and eukaryotic genomes in general, are typically repetitive, polyploid and heterozygous, which complicates genome assembly. The short read lengths of early Sanger and current next-generation sequencing platforms hinder assembly through complex repeat regions, and many draft and reference genomes are fragmented, lacking skewed GC and repetitive intergenic sequences, which are gaining importance due to projects like the Encyclopedia of DNA Elements (ENCODE). Here we report the whole-genome sequencing and assembly of the desiccation-tolerant grass Oropetium thomaeum. Using only single-molecule real-time sequencing, which generates long (>16?kilobases) reads with random errors, we assembled 99% (244?megabases) of the Oropetium genome into 625 contigs with an N50 length of 2.4?megabases. Oropetium is an example of a ‘near-complete’ draft genome which includes gapless coverage over gene space as well as intergenic sequences such as centromeres, telomeres, transposable elements and rRNA clusters that are typically unassembled in draft genomes. Oropetium has 28,466 protein-coding genes and 43% repeat sequences, yet with 30% more compact euchromatic regions it is the smallest known grass genome. The Oropetium genome demonstrates the utility of single-molecule real-time sequencing for assembling high-quality plant and other eukaryotic genomes, and serves as a valuable resource for the plant comparative genomics community.


July 19, 2019

Genome analysis of the fruiting body forming myxobacterium Chondromyces crocatus reveals high potential for natural product biosynthesis.

Here we report the first complete genome sequence of the type strain of the myxobacterial genus Chondromyces – Chondromyces crocatus Cm c5. It presents one of the largest prokaryotic genomes featuring a single circular chromosome and no plasmids. Analysis revealed an enlarged set of tRNA genes, along with reduced pressure on preferred codon usage compared to other bacterial genomes. The large coding capacity and the plethora of encoded secondary metabolite biosynthetic gene clusters is in line with the capability of Cm c5 to produce an arsenal of anti-bacterial, anti-fungal and cytotoxic compounds. Known pathways of the ajudazol, chondramide, chondrochloren, crocacin, crocapeptin and thuggacin compound families are complemented by many more natural compound biosynthetic gene clusters in the chromosome. Whole-genome comparison of the fruiting-body forming type-strain (Cm c5 = DSM 14714) to an accustomed laboratory strain which has lost this ability (Cm c5 fr-) revealed genetic changes in three loci. In addition to the low synteny found with the closest sequenced representative of the same family, Sorangium cellulosum, extensive genetic information duplication, and broad application of eukaryotic-type signal transduction systems are hallmarks of this 11.3 Mbp prokaryotic genome. Copyright © 2016, American Society for Microbiology. All Rights Reserved.


July 19, 2019

A method for near full-length amplification and sequencing for six hepatitis C virus genotypes.

Hepatitis C virus (HCV) is a rapidly evolving RNA virus that has been classified into seven genotypes. All HCV genotypes cause chronic hepatitis, which ultimately leads to liver diseases such as cirrhosis. The genotypes are unevenly distributed across the globe, with genotypes 1 and 3 being the most prevalent. Until recently, molecular epidemiological studies of HCV evolution within the host and at the population level have been limited to the analyses of partial viral genome segments, as it has been technically challenging to amplify and sequence the full-length of the 9.6 kb HCV genome. Although recent improvements have been made in full genome sequencing methodologies, these protocols are still either limited to a specific genotype or cost-inefficient.In this study we describe a genotype-specific protocol for the amplification and sequencing of the near-full length genome of all six major HCV genotypes. We applied this protocol to 122 HCV positive clinical samples, and had a successful genome amplification rate of 90 %, when the viral load was greater than 15,000 IU/ml. The assay was shown to have a detection limit of 1-3 cDNA copies per reaction. The method was tested with both Illumina and PacBio single molecule, real-time (SMRT) sequencing technologies. Illumina sequencing resulted in deep coverage and allowed detection of rare variants as well as HCV co-infection with multiple genotypes. The application of the method with PacBio RS resulted in sequence reads greater than 9 kb that covered the near full-length HCV amplicon in a single read and enabled analysis of the near full-length quasispecies.The protocol described herein can be utilised for rapid amplification and sequencing of the near-full length HCV genome in a cost efficient manner suitable for a wide range of applications.


July 19, 2019

DNA methylation on N(6)-adenine in mammalian embryonic stem cells.

It has been widely accepted that 5-methylcytosine is the only form of DNA methylation in mammalian genomes. Here we identify N(6)-methyladenine as another form of DNA modification in mouse embryonic stem cells. Alkbh1 encodes a demethylase for N(6)-methyladenine. An increase of N(6)-methyladenine levels in Alkbh1-deficient cells leads to transcriptional silencing. N(6)-methyladenine deposition is inversely correlated with the evolutionary age of LINE-1 transposons; its deposition is strongly enriched at young (<1.5 million years old) but not old (>6 million years old) L1 elements. The deposition of N(6)-methyladenine correlates with epigenetic silencing of such LINE-1 transposons, together with their neighbouring enhancers and genes, thereby resisting the gene activation signals during embryonic stem cell differentiation. As young full-length LINE-1 transposons are strongly enriched on the X chromosome, genes located on the X chromosome are also silenced. Thus, N(6)-methyladenine developed a new role in epigenetic silencing in mammalian evolution distinct from its role in gene activation in other organisms. Our results demonstrate that N(6)-methyladenine constitutes a crucial component of the epigenetic regulation repertoire in mammalian genomes.


July 19, 2019

Towards better precision medicine: PacBio single-molecule long reads resolve the interpretation of HIV drug resistant mutation profiles at explicit quasispecies (haplotype) level.

Development of HIV-1 drug resistance mutations (HDRMs) is one of the major reasons for the clinical failure of antiretroviral therapy. Treatment success rates can be improved by applying personalized anti-HIV regimens based on a patient’s HDRM profile. However, the sensitivity and specificity of the HDRM profile is limited by the methods used for detection. Sanger-based sequencing technology has traditionally been used for determining HDRM profiles at the single nucleotide variant (SNV) level, but with a sensitivity of only = 20% in the HIV population of a patient. Next Generation Sequencing (NGS) technologies offer greater detection sensitivity (~ 1%) and larger scope (hundreds of samples per run). However, NGS technologies produce reads that are too short to enable the detection of the physical linkages of individual SNVs across the haplotype of each HIV strain present. In this article, we demonstrate that the single-molecule long reads generated using the Third Generation Sequencer (TGS), PacBio RS II, along with the appropriate bioinformatics analysis method, can resolve the HDRM profile at a more advanced quasispecies level. The case studies on patients’ HIV samples showed that the quasispecies view produced using the PacBio method offered greater detection sensitivity and was more comprehensive for understanding HDRM situations, which is complement to both Sanger and NGS technologies. In conclusion, the PacBio method, providing a promising new quasispecies level of HDRM profiling, may effect an important change in the field of HIV drug resistance research.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.