September 22, 2019

Targeted long-read sequencing of a locus under long-term balancing selection in Capsella.

Rapid advances in short-read DNA sequencing technologies have revolutionized population genomic studies, but there are genomic regions where this technology reaches its limits. Limitations mostly arise due to the difficulties in assembly or alignment to genomic regions of high sequence divergence and high repeat content, which are typical characteristics for loci under strong long-term balancing selection. Studying genetic diversity at such loci therefore remains challenging. Here, we investigate the feasibility and error rates associated with targeted long-read sequencing of a locus under balancing selection. For this purpose, we generated bacterial artificial chromosomes (BACs) containing the Brassicaceae S-locus, a region under strong negative frequency-dependent selection which has previously proven difficult to assemble in its entirety using short reads. We sequence S-locus BACs with single-molecule long-read sequencing technology and conduct de novo assembly of these S-locus haplotypes. By comparing repeated assemblies resulting from independent long-read sequencing runs on the same BAC clone we do not detect any structural errors, suggesting that reliable assemblies are generated, but we estimate an indel error rate of 5.7×10⁻⁵. A similar error rate was estimated based on comparison of Illumina short-read sequences and BAC assemblies. Our results show that, until de novo assembly of multiple individuals using long-read sequencing becomes feasible, targeted long-read sequencing of loci under balancing selection is a viable option with low error rates for single nucleotide polymorphisms or structural variation. We further find that short-read sequencing is a valuable complement, allowing correction of the relatively high rate of indel errors that result from this approach. Copyright © 2018 Bachmann et al.


September 22, 2019

Repeated evolution of self-compatibility for reproductive assurance.

Sexual reproduction in eukaryotes requires the fusion of two compatible gametes of opposite sexes or mating types. To meet the challenge of finding a mating partner with compatible gametes, evolutionary mechanisms such as hermaphroditism and self-fertilization have repeatedly evolved. Here, by combining the insights from comparative genomics, computer simulations and experimental evolution in fission yeast, we shed light on the conditions promoting separate mating types or self-compatibility by mating-type switching. Analogous to multiple independent transitions between switchers and non-switchers in natural populations mediated by structural genomic changes, novel switching genotypes readily evolved under selection in the experimental populations. Detailed fitness measurements accompanied by computer simulations show the benefits and costs of switching during sexual and asexual reproduction, governing the occurrence of both strategies in nature. Our findings illuminate the trade-off between the benefits of reproductive assurance and its fitness costs under benign conditions facilitating the evolution of self-compatibility.


September 22, 2019

RAD sequencing and a hybrid Antarctic fur seal genome assembly reveal rapidly decaying linkage disequilibrium, global population structure and evidence for inbreeding.

Recent advances in high throughput sequencing have transformed the study of wild organisms by facilitating the generation of high quality genome assemblies and dense genetic marker datasets. These resources have the potential to significantly advance our understanding of diverse phenomena at the level of species, populations and individuals, ranging from patterns of synteny through rates of linkage disequilibrium (LD) decay and population structure to individual inbreeding. Consequently, we used PacBio sequencing to refine an existing Antarctic fur seal (Arctocephalus gazella) genome assembly and genotyped 83 individuals from six populations using restriction site associated DNA (RAD) sequencing. The resulting hybrid genome comprised 6,169 scaffolds with an N50 of 6.21 Mb and provided clear evidence for the conservation of large chromosomal segments between the fur seal and dog (Canis lupus familiaris). Focusing on the most extensively sampled population of South Georgia, we found that LD decayed rapidly, reaching the background level by around 400 kb, consistent with other vertebrates but at odds with the notion that fur seals experienced a strong historical bottleneck. We also found evidence for population structuring, with four main Antarctic island groups being resolved. Finally, appreciable variance in individual inbreeding could be detected, reflecting the strong polygyny and site fidelity of the species. Overall, our study contributes important resources for future genomic studies of fur seals and other pinnipeds while also providing a clear example of how high throughput sequencing can generate diverse biological insights at multiple levels of organization. Copyright © 2018 Humble et al.


September 22, 2019

Extensive genomic diversity among Mycobacterium marinum strains revealed by whole genome sequencing.

Mycobacterium marinum is the causative agent for the tuberculosis-like disease mycobacteriosis in fish and skin lesions in humans. Ubiquitous in its geographical distribution, M. marinum is known to occupy diverse fish as hosts. However, information about its genomic diversity is limited. Here, we provide the genome sequences for 15 M. marinum strains isolated from infected humans and fish. Comparative genomic analysis of these and four available genomes of the M. marinum strains M, E11, MB2 and Europe reveals high genomic diversity among the strains, leading to the conclusion that M. marinum should be divided into two different clusters, the “M”- and the “Aronson”-type. We suggest that these two clusters should be considered to represent two M. marinum subspecies. Our data also show that the M. marinum pan-genome for both groups is open and expanding, and we provide data showing a high number of mutational hotspots in M. marinum relative to other mycobacteria such as Mycobacterium tuberculosis. This high genomic diversity might be related to the ability of M. marinum to occupy different ecological niches.


September 22, 2019

Exploring benzimidazole resistance in Haemonchus contortus by next generation sequencing and droplet digital PCR.

Anthelmintic resistance in gastrointestinal nematode (GIN) parasites of grazing ruminants is on the rise in countries across the world. Haemonchus contortus is one of the most frequently encountered drug-resistant GINs in small ruminants. This blood-sucking abomasal nematode contributes to massive treatment costs and poses a serious threat to farm animal health. To prevent the establishment of resistant strains of this parasite, up-to-date molecular techniques need to be proposed which would allow for quick, cheap and accurate identification of individuals infected with resistant worms. An effort was made in the previous decade, with the development of the pyrosequencing method to detect resistance-predicting alleles. Here we propose a novel droplet digital PCR (ddPCR) assay for rapid and precise identification of H. contortus strains as being resistant or susceptible to benzimidazole drugs based on the presence or absence of the most common resistance-conferring mutation F200Y (TAC) in the β-tubulin isotype 1 gene. The newly developed ddPCR assay was first optimized and validated utilizing DNA templates from single-worm samples, which were previously sequenced using the next generation PacBio RSII Sequencing (NGS) platform. Subsequent NGS results for faecal larval cultures were then used as a reference to compare the obtained values for fractional abundances of the resistance-determining mutant allele between ddPCR and NGS techniques in each sample. Both methods produced highly similar results and ddPCR proved to be a reliable tool which, when utilized at full capacity, can be used to create a powerful mutation detection and quantification assay. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.


September 22, 2019

Convergent evolution of complex genomic rearrangements in two fungal meiotic drive elements.

Meiotic drive is widespread in nature. The conflict it generates is expected to be an important motor for evolutionary change and innovation. In this study, we investigated the genomic consequences of two large multi-gene meiotic drive elements, Sk-2 and Sk-3, found in the filamentous ascomycete Neurospora intermedia. Using long-read sequencing, we generated the first complete and well-annotated genome assemblies of large, highly diverged, non-recombining regions associated with meiotic drive elements. Phylogenetic analysis shows that, even though Sk-2 and Sk-3 are located in the same chromosomal region, they do not form sister clades, suggesting independent origins or at least a long evolutionary separation. We conclude that they have in a convergent manner accumulated similar patterns of tandem inversions and dense repeat clusters, presumably in response to similar needs to create linkage between genes causing drive and resistance.


September 22, 2019

Endogenous rRNA sequence variation can regulate stress response gene expression and phenotype.

Prevailing dogma holds that ribosomes are uniform in composition and function. Here, we show that nutrient limitation-induced stress in E. coli changes the relative expression of rDNA operons to alter the rRNA composition within the actively translating ribosome pool. The most upregulated operon encodes the unique 16S rRNA, rrsH, distinguished by conserved sequence variation within the small ribosomal subunit. rrsH-bearing ribosomes affect the expression of functionally coherent gene sets and alter the levels of the RpoS sigma factor, the master regulator of the general stress response. These impacts are associated with phenotypic changes in antibiotic sensitivity, biofilm formation, and cell motility and are regulated by stress response proteins, RelA and RelE, as well as the metabolic enzyme and virulence-associated protein, AdhE. These findings establish that endogenously encoded, naturally occurring rRNA sequence variation can modulate ribosome function, central aspects of gene expression regulation, and cellular physiology. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.


September 22, 2019

How complete are “complete” genome assemblies?-An avian perspective.

The genomics revolution has led to the sequencing of a large variety of nonmodel organisms often referred to as “whole” or “complete” genome assemblies. But how complete are these, really? Here, we use birds as an example for nonmodel vertebrates and find that, although suitable in principle for genomic studies, the current standard of short-read assemblies misses a significant proportion of the expected genome size (7% to 42%; mean 20 ± 9%). In particular, regions with strongly deviating nucleotide composition (e.g., guanine-cytosine-[GC]-rich) and regions highly enriched in repetitive DNA (e.g., transposable elements and satellite DNA) are usually underrepresented in assemblies. However, long-read sequencing technologies successfully characterize many of these underrepresented GC-rich or repeat-rich regions in several bird genomes. For instance, only ~2% of the expected total base pairs are missing in the last chicken reference (galGal5). These assemblies still contain thousands of gaps (i.e., fragmented sequences) because some chromosomal structures (e.g., centromeres) likely contain arrays of repetitive DNA that are too long to bridge with currently available technologies. We discuss how to minimize the number of assembly gaps by combining the latest available technologies with complementary strengths. Finally, we emphasize the importance of knowing the location, size and potential content of assembly gaps when making population genetic inferences about adjacent genomic regions. © 2018 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.


July 19, 2019

CGGBP1 mitigates cytosine methylation at repetitive DNA sequences.

CGGBP1 is a repetitive DNA-binding transcription regulator with target sites at CpG-rich sequences such as CGG repeats, Alu-SINEs and L1-LINEs. The role of CGGBP1 as a possible mediator of CpG methylation, however, remains unknown. At CpG-rich sequences, cytosine methylation is a major mechanism of transcriptional repression. Concordantly, gene-rich regions typically carry lower levels of CpG methylation than the repetitive elements. It is well known that at the interspersed repeats Alu-SINEs and L1-LINEs, high levels of CpG methylation constitute a transcriptional silencing and retrotransposon-inactivating mechanism. Here, we have studied genome-wide CpG methylation with or without CGGBP1 depletion. By high throughput sequencing of bisulfite-treated genomic DNA we have identified CGGBP1 to be a negative regulator of CpG methylation at repetitive DNA sequences. In addition, we have studied CpG methylation alterations on Alu and L1 retrotransposons in CGGBP1-depleted cells using a novel bisulfite-treatment and high throughput sequencing approach. The results clearly show that CGGBP1 is a possible bidirectional regulator of CpG methylation at Alus, and acts as a repressor of methylation at L1 retrotransposons.


July 19, 2019

Large deletions at the SHOX locus in the pseudoautosomal region are associated with skeletal atavism in Shetland ponies.

Skeletal atavism in Shetland ponies is a heritable disorder characterized by abnormal growth of the ulna and fibula that extend the carpal and tarsal joints, respectively. This causes abnormal skeletal structure and impaired movement, and affected foals are usually euthanized. In order to identify the causal mutation we subjected six confirmed Swedish cases and a DNA pool consisting of 21 control individuals to whole genome resequencing. We screened for polymorphisms where the cases and the control pool were fixed for opposite alleles and observed this signature for only 25 SNPs, most of which were scattered on genome assembly unassigned scaffolds. Read depth analysis at these loci revealed homozygosity or compound heterozygosity for two partially overlapping large deletions in the pseudoautosomal region (PAR) of chromosome X/Y in cases but not in the control pool. One of these deletions removes the entire coding region of the SHOX gene and both deletions remove parts of the CRLF2 gene located downstream of SHOX. The horse reference assembly of the PAR is highly fragmented, and in order to characterize this region we sequenced bacterial artificial chromosome (BAC) clones by single-molecule real-time (SMRT) sequencing technology. This considerably improved the assembly and enabled size estimations of the two deletions to 160-180 kb and 60-80 kb, respectively. Complete association between the presence of these deletions and disease status was verified in eight other affected horses. The result of the present study is consistent with previous studies in humans showing crucial importance of SHOX for normal skeletal development. Copyright © 2016 Author et al.


July 19, 2019

Genomic structure of the horse major histocompatibility complex class II region resolved using PacBio long-read sequencing technology.

The mammalian Major Histocompatibility Complex (MHC) region contains several gene families characterized by highly polymorphic loci with extensive nucleotide diversity, copy number variation of paralogous genes, and long repetitive sequences. This structural complexity has made it difficult to construct a reliable reference sequence of the horse MHC region. In this study, we used long-read single molecule, real-time (SMRT) sequencing technology from Pacific Biosciences (PacBio) to sequence eight Bacterial Artificial Chromosome (BAC) clones spanning the horse MHC class II region. The final assembly resulted in a 1,165,328 bp continuous, gap-free sequence with 35 manually curated genomic loci of which 23 were considered to be functional and 12 to be pseudogenes. In comparison to the MHC class II region in other mammals, the corresponding region in horse shows extraordinary copy number variation and different relative location and directionality of the Eqca-DRB, -DQA, -DQB and -DOB loci. This is the first long-read sequence assembly of the horse MHC class II region with rigorous manual gene annotation, and it will serve as an important resource for association studies of immune-mediated equine diseases and for evolutionary analysis of genetic diversity in this region.


July 19, 2019

A novel approach using long-read sequencing and ddPCR to investigate gonadal mosaicism and estimate recurrence risk in two families with developmental disorders.

De novo mutations contribute significantly to severe early-onset genetic disorders. Even if the mutation is apparently de novo, there is a recurrence risk due to parental germ line mosaicism, depending on the gonadal generation in which the mutation occurred. We demonstrate the power of using SMRT sequencing and ddPCR to determine parental origin and allele frequencies of de novo mutations in germ cells in two families who had undergone assisted reproduction. In the first family, a TCOF1 variant c.3156C>T was identified in the proband with Treacher Collins syndrome. The variant affects splicing and was determined to be of paternal origin. It was present in <1% of the paternal germ cells, suggesting a very low recurrence risk. In the second family, the couple had undergone several unsuccessful pregnancies where a de novo mutation PTPN11 c.923A>C causing Noonan syndrome was identified. The variant was present in 40% of the paternal germ cells, suggesting a high recurrence risk. Our findings highlight a successful strategy to identify the parental origin of mutations and to investigate the recurrence risk in couples that have undergone assisted reproduction with an unknown donor or in couples with gonadal mosaicism that will undergo preimplantation genetic diagnosis. © 2017 The Authors Prenatal Diagnosis published by John Wiley & Sons Ltd.


July 19, 2019

Amplification-free, CRISPR-Cas9 targeted enrichment and SMRT Sequencing of repeat-expansion disease causative genomic regions

Targeted sequencing has proven to be an economical means of obtaining sequence information for one or more defined regions of a larger genome. However, most target enrichment methods require amplification. Some genomic regions, such as those with extreme GC content and repetitive sequences, are recalcitrant to faithful amplification. Yet, many human genetic disorders are caused by repeat expansions, including difficult-to-sequence tandem repeats. We have developed a novel, amplification-free enrichment technique that employs the CRISPR-Cas9 system for specific targeting of multiple genomic loci. This method, in conjunction with long reads generated through Single Molecule, Real-Time (SMRT) sequencing and unbiased coverage, enables enrichment and sequencing of complex genomic regions that cannot be investigated with other technologies. Using human genomic DNA samples, we demonstrate successful targeting of causative loci for Huntington's disease (HTT; CAG repeat), Fragile X syndrome (FMR1; CGG repeat), amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (C9orf72; GGGGCC repeat), and spinocerebellar ataxia type 10 (SCA10) (ATXN10; variable ATTCT repeat). The method, amenable to multiplexing across multiple genomic loci, uses an amplification-free approach that facilitates the isolation of hundreds of individual on-target molecules in a single SMRT Cell and accurate sequencing through long repeat stretches, regardless of extreme GC percent or sequence complexity content. Our novel targeted sequencing method opens new doors to genomic analyses independent of PCR amplification that will facilitate the study of repeat expansion disorders.


July 19, 2019

The evolution of dark matter in the mitogenome of seed beetles.

Animal mitogenomes are generally thought of as being economic and optimized for rapid replication and transcription. We use long-read sequencing technology to assemble the remarkable mitogenomes of four species of seed beetles. These are the largest circular mitogenomes ever assembled in insects, ranging from 24,496 to 26,613 bp in total length, and are exceptional in that some 40% consists of non-coding DNA. The size expansion is due to two very long intergenic spacers (LIGSs), rich in tandem repeats. The two LIGSs are present in all species but vary greatly in length (114-10,408 bp), show very low sequence similarity, divergent tandem repeat motifs, a very high AT content and concerted length evolution. The LIGSs have been retained for at least some 45 my but must have undergone repeated reductions and expansions, despite strong purifying selection on protein coding mtDNA genes. The LIGSs are located in two intergenic sites where a few recent studies of insects have also reported shorter LIGSs (>200 bp). These sites may represent spaces that tolerate neutral repeat array expansions or, alternatively, the LIGSs may function to allow a more economic translational machinery. Mitochondrial respiration in adult seed beetles is based almost exclusively on fatty acids, which reduces the need for building complex I of the oxidative phosphorylation pathway (NADH dehydrogenase). One possibility is thus that the LIGSs may allow depressed transcription of NAD genes. RNA sequencing showed that LIGSs are partly transcribed and transcriptional profiling suggested that all seven mtDNA NAD genes indeed show low levels of transcription and co-regulation of transcription across sexes and tissues. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


July 19, 2019

Detailed analysis of HTT repeat elements in human blood using targeted amplification-free long-read sequencing.

Amplification of DNA is required as a mandatory step during library preparation in most targeted sequencing protocols. This can be a critical limitation when targeting regions that are highly repetitive or with extreme guanine-cytosine (GC) content, including repeat expansions associated with human disease. Here, we used an amplification-free protocol for targeted enrichment utilizing the CRISPR/Cas9 system (No-Amp Targeted sequencing) in combination with single molecule, real-time (SMRT) sequencing for studying repeat elements in the huntingtin (HTT) gene, where an expanded CAG repeat is causative for Huntington disease. We also developed a robust data analysis pipeline for repeat element analysis that is independent of alignment of reads to a reference genome. The method was applied to 11 diagnostic blood samples, and for all 22 alleles the resulting CAG repeat count agreed with previous results based on fragment analysis. The amplification-free protocol also allowed for studying somatic variability of repeat elements in our samples, without the interference of PCR stutter. In summary, with No-Amp Targeted sequencing in combination with our analysis pipeline, we could accurately study repeat elements that are difficult to investigate using PCR-based methods. © 2018 The Authors. Human Mutation published by Wiley Periodicals, Inc.
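
As a rough illustration of what alignment-free repeat counting can look like, the sketch below counts the longest uninterrupted run of a repeat unit directly in a read. It is not the authors' pipeline: the function name `longest_repeat_run` is hypothetical, and the sketch assumes error-free reads with no interruptions inside the repeat tract, which a real analysis would need to handle.

```python
import re


def longest_repeat_run(read: str, unit: str = "CAG") -> int:
    """Return the number of units in the longest uninterrupted tandem run.

    Illustrative only: assumes the read is error-free and the repeat
    tract contains no interruptions or alternative units.
    """
    # Match one or more back-to-back copies of the repeat unit.
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(read.upper())]
    # A read without the unit yields no runs, so fall back to zero.
    return max(runs, default=0)
```

Applied per read, counts like these could then be tallied per sample to look at the distribution of repeat lengths, which is where somatic variability would show up.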

