July 19, 2019

PacBio SMRT assembly of a complex multi-replicon genome reveals chlorocatechol degradative operon in a region of genome plasticity.

We have sequenced a Burkholderia genome that contains multiple replicons and large repetitive elements that would make it inherently difficult to assemble by short read sequencing technologies. We illustrate how the integrated long read correction algorithms implemented through the PacBio Single Molecule Real-Time (SMRT) sequencing technology successfully provided a de novo assembly that is a reasonable estimate of both the gene content and genome organization without making any further modifications. This assembly is comparable to related organisms assembled by more labour intensive methods. Our assembled genome revealed regions of genome plasticity for further investigation, one of which harbours a chlorocatechol degradative operon highly homologous to those previously identified on globally ubiquitous plasmids. In an ideal world, this assembly would still require experimental validation to confirm gene order and copy number of repeated elements. However, we submit that particularly in instances where a polished genome is not the primary goal of the sequencing project, PacBio SMRT sequencing provides a financially viable option for generating a biologically relevant genome estimate that can be utilized by other researchers for comparative studies. Copyright © 2016. Published by Elsevier B.V.


July 19, 2019

Polymerase specific error rates and profiles identified by single molecule sequencing.

DNA polymerases have an innate error rate which is polymerase and DNA context specific. Historically the mutational rate and profiles have been measured using a variety of methods, each with their own technical limitations. Here we used the unique properties of single molecule sequencing to evaluate the mutational rate and profiles of six DNA polymerases at the sequence level. In addition to accurately determining mutations in double strands, single molecule sequencing also captures direction specific transversions and transitions through the analysis of heteroduplexes. Not only did the error rates vary, but also the direction specific transitions differed among polymerases. Copyright © 2016 Elsevier B.V. All rights reserved.


July 19, 2019

Mitotic intragenic recombination: A mechanism of survival for several congenital disorders of glycosylation.

Congenital disorders of glycosylation (CDGs) are disorders of abnormal protein glycosylation that affect multiple organ systems. Because most CDGs have been described in only a few individuals, our understanding of the associated phenotypes and the mechanisms of individual survival are limited. In the process of studying two siblings, aged 6 and 11 years, with MOGS-CDG and biallelic MOGS (mannosyl-oligosaccharide glucosidase) mutations (GenBank: NM_006302.2; c.[65C>A; 329G>A] p.[Ala22Glu; Arg110His]; c.[370C>T] p.[Gln124(*)]), we noted that their survival was much longer than the previous report of MOGS-CDG, in a child who died at 74 days of age. Upon mutation analysis, we detected multiple MOGS genotypes including wild-type alleles in their cultured fibroblast and peripheral blood DNA. Further analysis of DNA from cultured fibroblasts of six individuals with compound heterozygous mutations of PMM2 (PMM2-CDG), MPI (MPI-CDG), ALG3 (ALG3-CDG), ALG12 (ALG12-CDG), DPAGT1 (DPAGT1-CDG), and ALG1 (ALG1-CDG) also identified multiple genotypes including wild-type alleles for each. Droplet digital PCR showed a ratio of nearly 1:1 wild-type to mutant alleles for most, but not all, mutations. This suggests that mitotic recombination contributes to the survival and the variable expressivity of individuals with compound heterozygous CDGs. This also provides an explanation for prior observations of a reduced frequency of homozygous mutations and might contribute to increased levels of residual enzyme activity in cultured fibroblasts of individuals with MPI- and PMM2-CDGs. Copyright © 2016 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.


July 19, 2019

Chromosomal-level assembly of the Asian seabass genome using long sequence reads and multi-layered scaffolding.

We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping along with shared synteny from closely related fish species to derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50 size over 25 Mb that span ~90% of the genome. The population structure of L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species’ native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics.
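For readers unfamiliar with the N50 statistic quoted above, here is a minimal sketch of how it is commonly computed: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name `n50` and the example lengths are illustrative assumptions and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Hypothetical helper for illustration only; not the authors' code.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length that pushes the running sum to at least
        # half of the total assembly length is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


if __name__ == "__main__":
    # Made-up contig lengths, for demonstration.
    print(n50([80, 70, 50, 40, 30, 20]))
```

A contig N50 over 1 Mb therefore means that at least half of the assembled bases lie in contigs of 1 Mb or longer.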


July 19, 2019

Bats may eat diurnal flies that rest on wind turbines

Bats are currently killed in large numbers at wind turbines worldwide, but the ultimate reason why this happens remains poorly understood. One hypothesis is that bats visit wind turbines to feed on insects exposed at the turbine towers. We used single molecule next generation DNA sequencing to identify stomach contents of 18 bats of four species (Pipistrellus pygmaeus, Nyctalus noctula, Eptesicus nilssonii and Vespertilio murinus) found dead under wind turbines in southern Sweden. Stomach contents were diverse but included typically diurnal flies, e.g. blow-flies (Calliphoridae), flesh-flies (Sarcophagidae) and houseflies (Muscidae) and also several flightless taxa. Such prey items were eaten by all bat species and at all wind turbine localities and it seems possible that they had been captured at or near the surface of the turbines at night. Using sticky traps, we documented an abundance of swarming (diurnal) ants (Myrmica spp.) and sometimes blow-flies and houseflies at the nacelle house. Near the base of the tower the catches were more diverse and corresponded better with the taxa found in the bat stomachs, including various diurnal flies. To evaluate if flies and other insects resting on the surface of a wind turbine are available to bats, we ensonified a house fly (Musca) on a smooth (plastic) surface with synthetic ultrasonic pulses of the frequencies used by the bat species that we had sampled. The experiment revealed potentially useful echoes, provided the attack angle was low and the frequency high (50–75 kHz). Hence resting flies and other arthropods can probably be detected by echolocating bats on the surface of a wind turbine. Our findings are consistent with published observations of the behavior of bats at wind turbines and may actually explain the function of some of these behaviors.


July 19, 2019

Nested Russian doll-like genetic mobility drives rapid dissemination of the Carbapenem resistance gene blaKPC

The recent widespread emergence of carbapenem resistance in Enterobacteriaceae is a major public health concern, as carbapenems are a therapy of last resort against this family of common bacterial pathogens. Resistance genes can mobilize via various mechanisms, including conjugation and transposition; however, the importance of this mobility in short-term evolution, such as within nosocomial outbreaks, is unknown. Using a combination of short- and long-read whole-genome sequencing of 281 blaKPC-positive Enterobacteriaceae isolates from a single hospital over 5 years, we demonstrate rapid dissemination of this carbapenem resistance gene to multiple species, strains, and plasmids. Mobility of blaKPC occurs at multiple nested genetic levels, with transmission of blaKPC strains between individuals, frequent transfer of blaKPC plasmids between strains/species, and frequent transposition of blaKPC transposon Tn4401 between plasmids. We also identify a common insertion site for Tn4401 within various Tn2-like elements, suggesting that homologous recombination between Tn2-like elements has enhanced the spread of Tn4401 between different plasmid vectors. Furthermore, while short-read sequencing has known limitations for plasmid assembly, various studies have attempted to overcome this by the use of reference-based methods. We also demonstrate that, as a consequence of the genetic mobility observed in this study, plasmid structures can be extremely dynamic, and therefore these reference-based methods, as well as traditional partial typing methods, can produce very misleading conclusions. Overall, our findings demonstrate that nonclonal resistance gene dissemination can be extremely rapid, presenting significant challenges for public health surveillance and achieving effective control of antibiotic resistance. Copyright © 2016 Sheppard et al.


July 19, 2019

Continuous evolution of Bacillus thuringiensis toxins overcomes insect resistance.

The Bacillus thuringiensis δ-endotoxins (Bt toxins) are widely used insecticidal proteins in engineered crops that provide agricultural, economic, and environmental benefits. The development of insect resistance to Bt toxins endangers their long-term effectiveness. Here we have developed a phage-assisted continuous evolution selection that rapidly evolves high-affinity protein-protein interactions, and applied this system to evolve variants of the Bt toxin Cry1Ac that bind a cadherin-like receptor from the insect pest Trichoplusia ni (TnCAD) that is not natively bound by wild-type Cry1Ac. The resulting evolved Cry1Ac variants bind TnCAD with high affinity (dissociation constant Kd = 11-41 nM), kill TnCAD-expressing insect cells that are not susceptible to wild-type Cry1Ac, and kill Cry1Ac-resistant T. ni insects up to 335-fold more potently than wild-type Cry1Ac. Our findings establish that the evolution of Bt toxins with novel insect cell receptor affinity can overcome insect Bt toxin resistance and confer lethality approaching that of the wild-type Bt toxin against non-resistant insects.


July 19, 2019

Genome structural diversity among 31 Bordetella pertussis isolates from two recent U.S. whooping cough statewide epidemics

During 2010 and 2012, California and Vermont, respectively, experienced statewide epidemics of pertussis with differences seen in the demographic affected, case clinical presentation, and molecular epidemiology of the circulating strains. To overcome limitations of the current molecular typing methods for pertussis, we utilized whole-genome sequencing to gain a broader understanding of how current circulating strains are causing large epidemics. Through the use of combined next-generation sequencing technologies, this study compared de novo, single-contig genome assemblies from 31 out of 33 Bordetella pertussis isolates collected during two separate pertussis statewide epidemics and 2 resequenced vaccine strains. Final genome architecture assemblies were verified with whole-genome optical mapping. Sixteen distinct genome rearrangement profiles were observed in epidemic isolate genomes, all of which were distinct from the genome structures of the two resequenced vaccine strains. These rearrangements appear to be mediated by repetitive sequence elements, such as high-copy-number mobile genetic elements and rRNA operons. Additionally, novel and previously identified single nucleotide polymorphisms were detected in 10 virulence-related genes in the epidemic isolates. Whole-genome variation analysis identified state-specific variants, and coding regions bearing nonsynonymous mutations were classified into functional annotated orthologous groups. Comprehensive studies on whole genomes are needed to understand the resurgence of pertussis and develop novel tools to better characterize the molecular epidemiology of evolving B. pertussis populations.

IMPORTANCE: Pertussis, or whooping cough, is the most poorly controlled vaccine-preventable bacterial disease in the United States, which has experienced a resurgence for more than a decade. Once viewed as a monomorphic pathogen, B. pertussis strains circulating during epidemics exhibit diversity visible on a genome structural level, previously undetectable by traditional sequence analysis using short-read technologies. For the first time, we combine short- and long-read sequencing platforms with restriction optical mapping for single-contig, de novo assembly of 31 isolates to investigate two geographically and temporally independent U.S. pertussis epidemics. These complete genomes reshape our understanding of B. pertussis evolution and strengthen molecular epidemiology toward one day understanding the resurgence of pertussis.


July 19, 2019

Large deletions at the SHOX locus in the pseudoautosomal region are associated with skeletal atavism in Shetland ponies.

Skeletal atavism in Shetland ponies is a heritable disorder characterized by abnormal growth of the ulna and fibula that extend the carpal and tarsal joints, respectively. This causes abnormal skeletal structure, impaired movements, and affected foals are usually euthanized. In order to identify the causal mutation we subjected six confirmed Swedish cases and a DNA pool consisting of 21 control individuals to whole genome resequencing. We screened for polymorphisms where the cases and the control pool were fixed for opposite alleles and observed this signature for only 25 SNPs, most of which were scattered on genome assembly unassigned scaffolds. Read depth analysis at these loci revealed homozygosity or compound heterozygosity for two partially overlapping large deletions in the pseudoautosomal region (PAR) of chromosome X/Y in cases but not in the control pool. One of these deletions removes the entire coding region of the SHOX gene and both deletions remove parts of the CRLF2 gene located downstream of SHOX. The horse reference assembly of the PAR is highly fragmented, and in order to characterize this region we sequenced bacterial artificial chromosome (BAC) clones by single-molecule real-time (SMRT) sequencing technology. This considerably improved the assembly and enabled size estimations of the two deletions to 160-180 kb and 60-80 kb, respectively. Complete association between the presence of these deletions and disease status was verified in eight other affected horses. The result of the present study is consistent with previous studies in humans showing crucial importance of SHOX for normal skeletal development. Copyright © 2016 Author et al.


July 19, 2019

Genomic changes following the reversal of a Y chromosome to an autosome in Drosophila pseudoobscura

Robertsonian translocations resulting in fusions between sex chromosomes and autosomes shape karyotype evolution by creating new sex chromosomes from autosomes. These translocations can also reverse sex chromosomes back into autosomes, which is especially intriguing given the dramatic differences between autosomes and sex chromosomes. To study the genomic events following a Y chromosome reversal, we investigated an autosome-Y translocation in Drosophila pseudoobscura. The ancestral Y chromosome fused to a small autosome (the dot chromosome) approximately 10–15 Mya. We used single molecule real-time sequencing reads to assemble the D. pseudoobscura dot chromosome, including this Y-to-dot translocation. We find that the intervening sequence between the ancestral Y and the rest of the dot chromosome is only ~78 Kb and is not repeat-dense, suggesting that the centromere now falls outside, rather than between, the fused chromosomes. The Y-to-dot region is 100 times smaller than the D. melanogaster Y chromosome, owing to changes in repeat landscape. However, we do not find a consistent reduction in intron sizes across the Y-to-dot region. Instead, deletions in intergenic regions and possibly a small ancestral Y chromosome size may explain the compact size of the Y-to-dot translocation.


July 19, 2019

Shifting fitness and epistatic landscapes reflect trade-offs along an evolutionary pathway.

Nature repurposes proteins via evolutionary processes. Such adaptation can come at the expense of the original protein’s function, which is a trade-off of adaptation. We sought to examine other potential adaptive trade-offs. We measured the effect on ampicillin resistance of ~12,500 unique single amino acid mutants of the TEM-1, TEM-17, TEM-19, and TEM-15 β-lactamase alleles, which constitute an adaptive path in the evolution of cefotaxime resistance. These protein fitness landscapes were compared and used to calculate epistatic interactions between these mutations and the two mutations in the pathway (E104K and G238S). This series of protein fitness landscapes provides a systematic, quantitative description of pairwise/tertiary intragenic epistasis involving adaptive mutations. We find that the frequency of mutations exhibiting epistasis increases along the evolutionary pathway. Adaptation moves the protein to a region in the fitness landscape characterized by decreased mutational robustness and increased ruggedness, as measured by fitness effects of mutations and epistatic interactions for TEM-1’s original function. This movement to such a “fitness territory” has evolutionary consequences and is an important adaptive trade-off and cost of adaptation. Our systematic study provides detailed insight into the relationships between mutation, protein structure, protein stability, and epistasis and quantitatively depicts the different costs inherent in the evolution of new functions. Copyright © 2016 Elsevier Ltd. All rights reserved.


July 19, 2019

Single-molecule sequencing reveals complex genomic variation of hepatitis B virus during 15 years of chronic infection following liver transplantation.

Chronic hepatitis B (CHB) is prevalent worldwide. The infectious agent, hepatitis B virus (HBV), replicates via an RNA intermediate and is error-prone, leading to rapid generation of closely related but not identical viral variants, including those that can escape host immune responses and antiviral treatments. The complexity of CHB can be further enhanced by the presence of HBV variants with large deletions in the genome, generated via splicing (spHBV). Although spHBV variants are incapable of autonomous replication, their replication is rescued by wild-type HBV. SpHBV variants have been shown to enhance wild-type virus replication, and their prevalence increases with liver disease progression. Single-molecule deep sequencing was performed on whole HBV genomes extracted from longitudinal samples of a post-liver transplant CHB subject, collected over a 15-year period that included the liver explant. By employing novel bioinformatics methods, this analysis showed complex dynamics of the viral population across a period of changing treatment regimens. The spHBV detected in the liver explant remained present post-transplantation, along with emergence of a highly diverse novel spHBV population as well as variants with multiple deletions in the preS genes. The identification of novel mutations outside the HBV reverse transcriptase gene that co-occur with known drug resistant mutations highlights the relevance of using full genome deep sequencing and supports the hypothesis that drug resistance involves interactions across the full-length HBV genome. Single-molecule sequencing allowed characterising, in unprecedented detail, the evolution of HBV populations and offered unique insights into the dynamics of defective and spHBV variants following liver transplantation and complex treatment regimes. This analysis also showed rapid adaptation of HBV populations to treatment regimens with evolving drug resistance phenotypes and evidence of purifying selection across the whole genome. Finally, the new open source bioinformatics tools are freely available, with the capacity to easily identify potential spliced variants from deep sequencing data. Copyright © 2016, American Society for Microbiology. All Rights Reserved.


July 19, 2019

AgIn: Measuring the landscape of CpG methylation of individual repetitive elements.

Determining the methylation state of regions with high copy numbers is challenging for second-generation sequencing, because the read length is insufficient to map reads uniquely, especially when repetitive regions are long and nearly identical to each other. Single-molecule real-time (SMRT) sequencing is a promising method for observing such regions, because it is not vulnerable to GC bias, it produces long read lengths, and its kinetic information is sensitive to DNA modifications. We propose a novel linear-time algorithm that combines the kinetic information for neighboring CpG sites and increases the confidence in identifying the methylation states of those sites. Using a practical read coverage of ~30-fold from an inbred strain medaka (Oryzias latipes), we observed that both the sensitivity and precision of our method on individual CpG sites were ~93.7%. We also observed a high correlation coefficient (R = 0.884) between our method and bisulfite sequencing, and for 92.0% of CpG sites, methylation levels ranging over [0, 1] were in concordance within an acceptable difference of 0.25. Using this method, we characterized the landscape of the methylation status of repetitive elements, such as LINEs, in the human genome, thereby revealing the strong correlation between CpG density and hypomethylation and detecting hypomethylation hot spots of LTRs and LINEs. We uncovered the methylation states for nearly identical active transposons, two novel LINE insertions of identity ~99% and length 6050 base pairs (bp) in the human genome, and 16 Tol2 elements of identity >99.8% and length 4682 bp in the medaka genome. AgIn (Aggregate on Intervals) is available at: https://github.com/hacone/AgIn. Contact: ysuzuki@cb.k.u-tokyo.ac.jp, moris@cb.k.u-tokyo.ac.jp. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author(s) 2016. Published by Oxford University Press.


July 19, 2019

Complete telomere-to-telomere de novo assembly of the Plasmodium falciparum genome through long-read (>11 kb), single molecule, real-time sequencing.

The application of next-generation sequencing to estimate genetic diversity of Plasmodium falciparum, the most lethal malaria parasite, has proved challenging due to the skewed AT-richness [~80.6% (A + T)] of its genome and the lack of technology to assemble highly polymorphic subtelomeric regions that contain clonally variant, multigene virulence families (e.g., var and rifin). To address this, we performed amplification-free, single molecule, real-time sequencing of P. falciparum genomic DNA and generated reads of average length 12 kb, with 50% of the reads between 15.5 and 50 kb in length. Next, using the Hierarchical Genome Assembly Process, we assembled the P. falciparum genome de novo and successfully compiled all 14 nuclear chromosomes telomere-to-telomere. We also accurately resolved centromeres [~90-99% (A + T)] and subtelomeric regions and identified large insertions and duplications that add extra var and rifin genes to the genome, along with smaller structural variants such as homopolymer tract expansions. Overall, we show that amplification-free, long-read sequencing combined with de novo assembly overcomes major challenges inherent to studying the P. falciparum genome. Indeed, this technology may not only identify the polymorphic and repetitive subtelomeric sequences of parasite populations from endemic areas but may also evaluate structural variation linked to virulence, drug resistance and disease transmission. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
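The (A + T) percentages quoted above are simple base-composition fractions. As a minimal sketch, assuming a plain nucleotide string as input, the fraction could be computed as follows; the function name `at_fraction` and the toy sequence are hypothetical and are not part of the published analysis.

```python
def at_fraction(sequence):
    """Return the fraction of A and T among unambiguous bases (A, C, G, T).

    Illustrative helper only; ambiguous symbols such as N are ignored.
    """
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    at_count = sum(1 for b in bases if b in "AT")
    return at_count / len(bases)


if __name__ == "__main__":
    # Toy sequence, not real P. falciparum data.
    print(f"{at_fraction('ATATTAGCAT') * 100:.1f}% (A + T)")
```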


July 19, 2019

Analysis of tandem gene copies in maize chromosomal regions reconstructed from long sequence reads.

Haplotype variation not only involves SNPs but also insertions and deletions, in particular gene copy number variations. However, comparisons of individual genomes have been difficult because traditional sequencing methods give too short reads to unambiguously reconstruct chromosomal regions containing repetitive DNA sequences. An example of such a case is the protein gene family in maize that acts as a sink for reduced nitrogen in the seed. Previously, 41-48 gene copies of the alpha zein gene family that spread over six loci spanning between 30- and 500-kb chromosomal regions have been described in two Iowa Stiff Stalk (SS) inbreds. Analyses of those regions were possible because of overlapping BAC clones, generated by an expensive and labor-intensive approach. Here we used single-molecule real-time (Pacific Biosciences) shotgun sequencing to assemble the six chromosomal regions from the Non-Stiff Stalk maize inbred W22 from a single DNA sequence dataset. To validate the reconstructed regions, we developed an optical map (BioNano genome map; BioNano Genomics) of W22 and found agreement between the two datasets. Using the sequences of full-length cDNAs from W22, we found that the error rate of PacBio sequencing seemed to be less than 0.1% after autocorrection and assembly. Expressed genes, some with premature stop codons, are interspersed with nonexpressed genes, giving rise to genotype-specific expression differences. Alignment of these regions with those from the previous analyzed regions of SS lines exhibits in part dramatic differences between these two heterotic groups.

