July 19, 2019

Short tandem repeats, segmental duplications, gene deletion, and genomic instability in a rapidly diversified immune gene family.

Genomic regions with repetitive sequences are considered unstable and prone to swift DNA diversification processes. A highly diverse immune gene family of the sea urchin (Strongylocentrotus purpuratus), called Sp185/333, is composed of clustered genes with similar sequence as well as several types of repeats ranging in size from short tandem repeats (STRs) to large segmental duplications. This repetitive structure may have been the basis for the incorrect assembly of this gene family in the sea urchin genome sequence. Consequently, we have resolved the structure of the family and profiled the members by sequencing selected BAC clones using Illumina and PacBio approaches. BAC insert assemblies identified 15 predicted genes that are organized into three clusters. Two of the gene clusters have almost identical flanking regions, suggesting that they may be non-matching allelic clusters residing at the same genomic locus. GA STRs surround all genes and appear in large stretches at locations of putatively deleted genes. GAT STRs are positioned at the edges of segmental duplications that include a subset of the genes. The unique locations of the STRs suggest their involvement in gene deletions and segmental duplications. Genomic profiling of the Sp185/333 gene diversity in 10 sea urchins shows that no gene repertoires are shared among individuals, indicating a very high gene diversification rate for this family. The repetitive genomic structure of the Sp185/333 family that includes STRs in strategic locations may serve as a platform for a controlled mechanism which regulates the processes of gene recombination, gene conversion, duplication and deletion. The outcome is genomic instability and allelic mismatches, which may further drive the swift diversification of the Sp185/333 gene family that may improve the immune fitness of the species.
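As a rough illustration of how GA and GAT short tandem repeats like those described above could be located in an assembled sequence, here is a minimal Python sketch. It is not the authors' method: the function name `find_strs`, the `min_repeats` threshold, and the toy sequence are all assumptions made for the example, and it only finds perfect, uninterrupted repeats.

```python
import re


def find_strs(sequence, motif, min_repeats=5):
    """Return (start, end, repeat_count) for each perfect tandem run of `motif`.

    Coordinates are 0-based and half-open. `min_repeats` is an arbitrary
    illustrative threshold, not a value taken from the study.
    """
    pattern = re.compile(r"(?:%s){%d,}" % (re.escape(motif), min_repeats))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        start, end = match.span()
        hits.append((start, end, (end - start) // len(motif)))
    return hits


# Toy sequence (hypothetical): a GA run and a GAT run separated by spacer bases.
toy = "ACGT" + "GA" * 8 + "TTCC" + "GAT" * 6 + "ACGT"
ga_hits = find_strs(toy, "GA")
gat_hits = find_strs(toy, "GAT")
print(ga_hits)
print(gat_hits)
```

Real STR annotation usually tolerates interruptions and mismatches within a repeat tract, which a plain regular expression like this one does not.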


July 19, 2019

IncFIIk plasmid harbouring an amplification of 16S rRNA methyltransferase-encoding gene rmtH associated with mobile element ISCR2.

To investigate the resistance mechanisms and genetic support underlying the high resistance level of the Klebsiella pneumoniae strain CMUL78 to aminoglycoside and β-lactam antibiotics. Antibiotic susceptibility was assessed by the disc diffusion method and MICs were determined by the microdilution method. Antibiotic resistance genes and their genetic environment were characterized by PCR and Sanger sequencing. Plasmid contents were analysed in the clinical strain and transconjugants obtained by mating-out assays. Complete plasmid sequencing was performed with PacBio and Illumina technology. Strain CMUL78 co-produced the 16S rRNA methyltransferase (RMTase) RmtH, carbapenemase OXA-48 and ESBL SHV-12. The rmtH- and blaSHV-12-encoding genes were harboured by a novel ~115 kb IncFIIk plasmid designated pRmtH, and blaOXA-48 by a ~62 kb IncL/M plasmid related to pOXA-48a. pRmtH plasmid possessed seven different stability modules, one of which is a novel hybrid toxin-antitoxin system. Interestingly, pRmtH plasmid harboured a 4-fold amplification of an rmtH-ISCR2 unit arranged in tandem and inserted within a novel IS26-based composite transposon designated Tn6329. This is the first known report of the 16S RMTase-encoding gene rmtH in a plasmid. The rmtH-ISCR2 unit was inserted in a composite transposon as a 4-fold tandem repeat, a scarcely reported organization. © The Author 2016. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.


July 19, 2019

Full-length mitochondrial-DNA sequencing on the PacBio RSII.

Conventional mitochondrial-DNA (MT DNA) sequencing approaches use Sanger sequencing of 20-40 partially overlapping PCR fragments per individual, which is a time- and resource-consuming process. We have developed a high-throughput, accurate, fast, and cost-effective human MT DNA sequencing approach. In this setup we first generate long-range PCR products for two partially overlapping 7.7 and 9.2 kb MT DNA-specific amplicons, add sample-specific barcodes, and sequence these on the PacBio RSII system to obtain full-length MT DNA sequences for genotyping/haplotyping purposes.


July 19, 2019

Recent advances in inferring viral diversity from high-throughput sequencing data.

Rapidly evolving RNA viruses prevail within a host as a collection of closely related variants, referred to as viral quasispecies. Advances in high-throughput sequencing (HTS) technologies have facilitated the assessment of the genetic diversity of such virus populations at an unprecedented level of detail. However, analysis of HTS data from virus populations is challenging due to short, error-prone reads. In order to account for uncertainties originating from these limitations, several computational and statistical methods have been developed for studying the genetic heterogeneity of virus populations. Here, we review methods for the analysis of HTS reads, including approaches to local diversity estimation and global haplotype reconstruction. Challenges posed by aligning reads, as well as the impact of reference biases on diversity estimates, are also discussed. In addition, we address some of the experimental approaches designed to improve the biological signal-to-noise ratio. In the future, computational methods for the analysis of heterogeneous virus populations are likely to continue being complemented by technological developments. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.


July 19, 2019

The deep origin and recent loss of venom toxin genes in rattlesnakes.

The genetic origin of novel traits is a central but challenging puzzle in evolutionary biology. Among snakes, phospholipase A2 (PLA2)-related toxins have evolved in different lineages to function as potent neurotoxins, myotoxins, or hemotoxins. Here, we traced the genomic origin and evolution of PLA2 toxins by examining PLA2 gene number, organization, and expression in both neurotoxic and non-neurotoxic rattlesnakes. We found that even though most North American rattlesnakes do not produce neurotoxins, the genes of a specialized heterodimeric neurotoxin predate the origin of rattlesnakes and were present in their last common ancestor (~22 mya). The neurotoxin genes were then deleted independently in the lineages leading to the Western Diamondback (Crotalus atrox) and Eastern Diamondback (C. adamanteus) rattlesnakes (~6 mya), while a PLA2 myotoxin gene retained in C. atrox was deleted from the neurotoxic Mojave rattlesnake (C. scutulatus; ~4 mya). The rapid evolution of PLA2 gene number appears to be due to transposon invasion that provided a template for non-allelic homologous recombination. Copyright © 2016 Elsevier Ltd. All rights reserved.


July 19, 2019

Single-molecule sequencing revealing the presence of distinct JC polyomavirus populations in patients with progressive multifocal leukoencephalopathy.

Progressive multifocal leukoencephalopathy (PML) is a fatal disease caused by reactivation of JC polyomavirus (JCPyV) in immunosuppressed individuals and lytic infection by neurotropic JCPyV in glial cells. The exact content of neurotropic mutations within individual JCPyV strains has not been studied to our knowledge. We exploited the capacity of single-molecule real-time sequencing technology to determine the sequence of complete JCPyV genomes in single reads. The method was used to precisely characterize individual neurotropic JCPyV strains of 3 patients with PML without the bias caused by assembly of short sequence reads. In the cerebrospinal fluid sample of a 73-year-old woman with rapid PML onset, 3 distinct JCPyV populations could be identified. All viral populations were characterized by rearrangements within the noncoding regulatory region (NCCR) and 1 point mutation, S267L in the VP1 gene, suggestive of neurotropic strains. One patient with PML had a single neurotropic strain with rearranged NCCR, and 1 patient had a single strain with small NCCR alterations. We report here, for the first time, full characterization of individual neurotropic JCPyV strains in the cerebrospinal fluid of patients with PML. It remains to be established whether PML pathogenesis is driven by one or several neurotropic strains in an individual.


July 19, 2019

Rapid functional and sequence differentiation of a tandemly repeated species-specific multigene family in Drosophila.

Gene clusters of recently duplicated genes are hotbeds for evolutionary change. However, our understanding of how mutational mechanisms and evolutionary forces shape the structural and functional evolution of these clusters is hindered by the high sequence identity among the copies, which typically results in their inaccurate representation in genome assemblies. The presumed testis-specific, chimeric gene Sdic originated and tandemly expanded in Drosophila melanogaster, contributing to increased male-male competition. Using various types of massively parallel sequencing data, we studied the organization, sequence evolution, and functional attributes of the different Sdic copies. By leveraging long-read sequencing data, we uncovered both copy number and order differences from the currently accepted annotation for the Sdic region. Despite evidence for pervasive gene conversion affecting the Sdic copies, we also detected signatures of two episodes of diversifying selection, which have contributed to the evolution of a variety of C-termini and miRNA binding site compositions. Expression analyses involving RNA-seq datasets from 59 different biological conditions revealed distinctive expression breadths among the copies, with three copies being transcribed in females, opening the possibility of a sexually antagonistic effect. Phenotypic assays using Sdic knock-out strains indicated that should this antagonistic effect exist, it does not compromise female fertility. Our results strongly suggest that the genome consolidation of the Sdic gene cluster is more the result of a quick exploration of different paths of molecular tinkering by different copies than a mere dosage increase, which could be a recurrent evolutionary outcome in the presence of persistent sexual selection. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.


July 19, 2019

CGG repeat-induced FMR1 silencing depends on the expansion size in human iPSCs and neurons carrying unmethylated full mutations.

In fragile X syndrome (FXS), CGG repeat expansion greater than 200 triplets is believed to trigger FMR1 gene silencing and disease etiology. However, FXS siblings have been identified with more than 200 CGGs, termed unmethylated full mutation (UFM) carriers, without gene silencing and disease symptoms. Here, we show that hypomethylation of the FMR1 promoter is maintained in induced pluripotent stem cells (iPSCs) derived from two UFM individuals. However, a subset of iPSC clones with large CGG expansions carries silenced FMR1. Furthermore, we demonstrate de novo silencing upon expansion of the CGG repeat size. FMR1 does not undergo silencing during neuronal differentiation of UFM iPSCs, and expression of large unmethylated CGG repeats has phenotypic consequences resulting in neurodegenerative features. Our data suggest that UFM individuals do not lack the cell-intrinsic ability to silence FMR1 and that inter-individual variability in the CGG repeat size required for silencing exists in the FXS population. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.


July 19, 2019

Examining sources of error in PCR by single-molecule sequencing.

Next-generation sequencing technology has enabled the detection of rare genetic or somatic mutations and contributed to our understanding of disease progression and evolution. However, many next-generation sequencing technologies first rely on DNA amplification, via the Polymerase Chain Reaction (PCR), as part of sample preparation workflows. Mistakes made during PCR appear in sequencing data and contribute to false mutations that can ultimately confound genetic analysis. In this report, a single-molecule sequencing assay was used to comprehensively catalog the different types of errors introduced during PCR, including polymerase misincorporation, structure-induced template-switching, PCR-mediated recombination and DNA damage. In addition to well-characterized polymerase base substitution errors, other sources of error were found to be equally prevalent. PCR-mediated recombination by Taq polymerase was observed at the single-molecule level, and surprisingly found to occur as frequently as polymerase base substitution errors, suggesting it may be an underappreciated source of error for multiplex amplification reactions. Inverted repeat structural elements in lacZ caused polymerase template-switching between the top and bottom strands during replication, and the frequency of these events was measured for different polymerases. For very accurate polymerases, DNA damage introduced during temperature cycling, and not polymerase base substitution errors, appeared to be the major contributor toward mutations occurring in amplification products. In total, we analyzed PCR products at the single-molecule level and present here a more complete picture of the types of mistakes that occur during DNA amplification.


July 19, 2019

Deletion-bias in DNA double-strand break repair differentially contributes to plant genome shrinkage.

In order to prevent genome instability, cells need to be protected by a number of repair mechanisms, including DNA double-strand break (DSB) repair. The extent to which DSB repair, biased towards deletions or insertions, contributes to evolutionary diversification of genome size is still under debate. We analyzed mutation spectra in Arabidopsis thaliana and in barley (Hordeum vulgare) by PacBio sequencing of three DSB-targeted loci each, uncovering repair via gene conversion, single strand annealing (SSA) or nonhomologous end-joining (NHEJ). Furthermore, phylogenomic comparisons between A. thaliana and two related species were used to detect naturally occurring deletions during Arabidopsis evolution. Arabidopsis thaliana revealed significantly more and larger deletions after DSB repair than barley, and barley displayed more and larger insertions. Arabidopsis displayed a clear net loss of DNA after DSB repair, mainly via SSA and NHEJ. Barley revealed a very weak net loss of DNA, apparently due to less active break-end resection and easier copying of template sequences into breaks. Comparative phylogenomics revealed several footprints of SSA in the A. thaliana genome. Quantitative assessment of DNA gain and loss through DSB repair processes suggests deletion-biased DSB repair causing ongoing genome shrinking in A. thaliana, whereas genome size in barley remains nearly constant. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.


July 19, 2019

Characterization of hepatitis C virus (HCV) envelope diversification from acute to chronic infection within a sexually transmitted HCV cluster by using single-molecule, real-time sequencing.

In contrast to other available next-generation sequencing platforms, PacBio single-molecule, real-time (SMRT) sequencing has the advantage of generating long reads albeit with a relatively higher error rate in unprocessed data. Using this platform, we longitudinally sampled and sequenced the hepatitis C virus (HCV) envelope genome region (1,680 nucleotides [nt]) from individuals belonging to a cluster of sexually transmitted cases. All five subjects were coinfected with HIV-1 and a closely related strain of HCV genotype 4d. In total, 50 samples were analyzed by using SMRT sequencing. By using 7 passes of circular consensus sequencing, the error rate was reduced to 0.37%, and the median number of sequences was 612 per sample. A further reduction of insertions was achieved by alignment against a sample-specific reference sequence. However, in vitro recombination during PCR amplification could not be excluded. Phylogenetic analysis supported close relationships among HCV sequences from the four male subjects and subsequent transmission from one subject to his female partner. Transmission was characterized by a strong genetic bottleneck. Viral genetic diversity was low during acute infection and increased upon progression to chronicity but subsequently fluctuated during chronic infection, caused by the alternate detection of distinct coexisting lineages. SMRT sequencing combines long reads with sufficient depth for many phylogenetic analyses and can therefore provide insights into within-host HCV evolutionary dynamics without the need for haplotype reconstruction using statistical algorithms.
IMPORTANCE: Next-generation sequencing has revolutionized the study of genetically variable RNA virus populations, but for phylogenetic and evolutionary analyses, longer sequences than those generated by most available platforms, while minimizing the intrinsic error rate, are desired. Here, we demonstrate for the first time that PacBio SMRT sequencing technology can be used to generate full-length HCV envelope sequences at the single-molecule level, providing a data set with large sequencing depth for the characterization of intrahost viral dynamics. The selection of consensus reads derived from at least 7 full circular consensus sequencing rounds significantly reduced the intrinsic high error rate of this method. We used this method to genetically characterize a unique transmission cluster of sexually transmitted HCV infections, providing insight into the distinct evolutionary pathways in each patient over time and identifying the transmission-associated genetic bottleneck as well as fluctuations in viral genetic diversity over time, accompanied by dynamic shifts in viral subpopulations. Copyright © 2017 American Society for Microbiology.
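To make the idea of keeping only reads with at least 7 consensus passes more concrete, the following Python sketch filters molecules by pass count and builds a consensus by column-wise majority vote. This is a deliberately simplified stand-in and an assumption of this example: production circular consensus callers use probabilistic models rather than a plain vote, and the function names, the pre-aligned equal-length passes, and the toy data are all hypothetical.

```python
from collections import Counter


def majority_consensus(passes):
    """Column-wise majority vote over equal-length, pre-aligned passes."""
    if not passes:
        raise ValueError("need at least one pass")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("passes must be pre-aligned to the same length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*passes)
    )


def filter_by_passes(reads, min_passes=7):
    """Keep only molecules observed in at least `min_passes` passes."""
    return {name: p for name, p in reads.items() if len(p) >= min_passes}


# Toy data (hypothetical): one molecule with 7 passes, one with only 3.
truth = "ACGTACGT"
reads = {
    "molecule_a": [truth] * 4 + ["ACGAACGT", "TCGTACGT", "ACGTACGA"],
    "molecule_b": [truth, "ACGTTCGT", "ACGAACGT"],
}
kept = filter_by_passes(reads, min_passes=7)
consensus = {name: majority_consensus(p) for name, p in kept.items()}
print(consensus)
```

Because each error in the toy passes occurs at a different position, the vote recovers the underlying sequence; errors that recur at the same position across passes would not be corrected this way.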


July 19, 2019

Genomic confirmation of vancomycin-resistant Enterococcus transmission from deceased donor to liver transplant recipient.

In a liver transplant recipient with vancomycin-resistant Enterococcus (VRE) surgical site and bloodstream infection, a combination of pulsed-field gel electrophoresis, multilocus sequence typing, and whole genome sequencing identified that donor and recipient VRE isolates were highly similar when compared to time-matched hospital isolates. Comparison of de novo assembled isolate genomes was highly suggestive of transplant transmission rather than hospital-acquired transmission and also identified subtle internal rearrangements between donor and recipient missed by other genomic approaches. Given the improved resolution, whole-genome assembly of pathogen genomes is likely to become an essential tool for investigation of potential organ transplant transmissions.


July 19, 2019

Diversity and activity of alternative nitrogenases in sequenced genomes and coastal environments.

The nitrogenase enzyme, which catalyzes the reduction of N2 gas to NH4+, occurs as three separate isozymes that use Mo, Fe-only, or V. The majority of global nitrogen fixation is attributed to the more efficient ‘canonical’ Mo-nitrogenase, whereas Fe-only and V-(‘alternative’) nitrogenases are often considered ‘backup’ enzymes, used when Mo is limiting. Yet, the environmental distribution and diversity of alternative nitrogenases remain largely unknown. We searched for alternative nitrogenase genes in sequenced genomes and used PacBio sequencing to explore the diversity of canonical (nifD) and alternative (anfD and vnfD) nitrogenase amplicons in two coastal environments: the Florida Everglades and Sippewissett Marsh (MA). Genome-based searches identified an additional 25 species and 10 genera not previously known to encode alternative nitrogenases. Alternative nitrogenase amplicons were found in both Sippewissett Marsh and the Florida Everglades and their activity was further confirmed using newly developed isotopic techniques. Conserved amino acid sequences corresponding to cofactor ligands were also analyzed in anfD and vnfD amplicons, offering insight into environmental variants of these motifs. This study increases the number of available anfD and vnfD sequences ~20-fold and allows for the first comparisons of environmental Mo-, Fe-only, and V-nitrogenase diversity. Our results suggest that alternative nitrogenases are maintained across a range of organisms and environments and that they can make important contributions to nitrogenase diversity and nitrogen fixation.


July 19, 2019

Gorilla MHC class I gene and sequence variation in a comparative context.

Comparisons of MHC gene content and diversity among closely related species can provide insights into the evolutionary mechanisms shaping immune system variation. After chimpanzees and bonobos, gorillas are humans’ closest living relatives; but in contrast, relatively little is known about the structure and variation of gorilla MHC class I genes (Gogo). Here, we combined long-range amplifications and long-read sequencing technology to analyze full-length MHC class I genes in 35 gorillas. We obtained 50 full-length genomic sequences corresponding to 15 Gogo-A alleles, 4 Gogo-Oko alleles, 21 Gogo-B alleles, and 10 Gogo-C alleles, including 19 novel coding region sequences. We identified two previously undetected MHC class I genes related to Gogo-A and Gogo-B, respectively, thereby illustrating the potential of this approach for efficient and highly accurate MHC genotyping. Consistent with their phylogenetic position within the hominid family, individual gorilla MHC haplotypes share characteristics with humans and chimpanzees as well as orangutans, suggesting a complex history of the MHC class I genes in humans and the great apes. However, the overall MHC class I diversity appears to be low, further supporting the hypothesis that gorillas might have experienced a reduction of their MHC repertoire.


July 19, 2019

Genomic structure of the horse major histocompatibility complex class II region resolved using PacBio long-read sequencing technology.

The mammalian Major Histocompatibility Complex (MHC) region contains several gene families characterized by highly polymorphic loci with extensive nucleotide diversity, copy number variation of paralogous genes, and long repetitive sequences. This structural complexity has made it difficult to construct a reliable reference sequence of the horse MHC region. In this study, we used long-read single molecule, real-time (SMRT) sequencing technology from Pacific Biosciences (PacBio) to sequence eight Bacterial Artificial Chromosome (BAC) clones spanning the horse MHC class II region. The final assembly resulted in a 1,165,328 bp continuous, gap-free sequence with 35 manually curated genomic loci, of which 23 were considered to be functional and 12 to be pseudogenes. In comparison to the MHC class II region in other mammals, the corresponding region in horse shows extraordinary copy number variation and different relative location and directionality of the Eqca-DRB, -DQA, -DQB and -DOB loci. This is the first long-read sequence assembly of the horse MHC class II region with rigorous manual gene annotation, and it will serve as an important resource for association studies of immune-mediated equine diseases and for evolutionary analysis of genetic diversity in this region.

