July 19, 2019  |  

Separate F-type plasmids have shaped the evolution of the H30 subclone of Escherichia coli sequence type 131.

The extraintestinal pathogenic Escherichia coli (ExPEC) H30 subclone of sequence type 131 (ST131-H30) has emerged abruptly as a dominant lineage of ExPEC responsible for human disease. The ST131-H30 lineage has been well described phylogenetically, yet its plasmid complement is not fully understood. Here, single-molecule, real-time sequencing was used to generate the complete plasmid sequences of ST131-H30 isolates and those belonging to other ST131 clades. Comparative analyses revealed separate F-type plasmids that have shaped the evolution of the main fluoroquinolone-resistant ST131-H30 clades. Specifically, an F1:A2:B20 plasmid is strongly associated with the H30R/C1 clade, whereas an F2:A1:B- plasmid is associated with the H30Rx/C2 clade. A series of plasmid gene losses, gains, and rearrangements involving IS26 likely led to the current plasmid complements within each ST131-H30 sublineage, which contain several overlapping gene clusters with putative functions in virulence and fitness, suggesting plasmid-mediated convergent evolution. Evidence suggests that the H30Rx/C2-associated F2:A1:B- plasmid type was present in strains ancestral to the acquisition of fluoroquinolone resistance and prior to the introduction of a multidrug resistance-encoding gene cassette harboring blaCTX-M-15. In vitro experiments indicated a host strain-independent low frequency of plasmid transfer, differential levels of plasmid stability even between closely related ST131-H30 strains, and possible epistasis for carriage of these plasmids within the H30R/Rx lineages.

IMPORTANCE: A clonal lineage of Escherichia coli known as ST131 has emerged as a dominating strain type causing extraintestinal infections in humans. The evolutionary history of ST131 E. coli is now well understood. However, the role of plasmids in ST131’s evolutionary history is poorly defined. This study utilized real-time, single-molecule sequencing to compare plasmids from various current and historical lineages of ST131. From this work, it was determined that a series of plasmid gains, losses, and recombinational events has led to the currently circulating plasmids of ST131 strains. These plasmids appear to have evolved to acquire similar gene clusters on multiple occasions, suggesting possible plasmid-mediated convergent evolution leading to evolutionary success. These plasmids also appear to be better suited to exist in specific strains of ST131 due to coadaptive mutations. Overall, a series of events has enabled the evolution of ST131 plasmids, possibly contributing to the lineage’s success.


July 19, 2019  |  

Rapid sequencing of complete env genes from primary HIV-1 samples

The ability to study rapidly evolving viral populations has been constrained by the read length of next-generation sequencing approaches and the sampling depth of single-genome amplification methods. Here, we develop and characterize a method using Pacific Biosciences Single Molecule, Real-Time (SMRT) sequencing technology to sequence multiple, intact full-length human immunodeficiency virus-1 env genes amplified from viral RNA populations circulating in blood, and provide computational tools for analyzing and visualizing these data.


July 19, 2019  |  

Long read sequencing technology to solve complex genomic regions assembly in plants

Background: Numerous completed or on-going whole genome sequencing projects have highlighted the fact that obtaining a high quality genome sequence is necessary to address comparative genomics questions such as structural variations among genotypes and gain or loss of specific function. Despite the spectacular progress that has been made in sequencing technologies, obtaining accurate and reliable data is still a challenge, both at the whole genome scale and when targeting specific genomic regions. These problems are even more noticeable for complex plant genomes. Most plant genomes are known to be particularly challenging due to their size, high density of repetitive elements and various levels of ploidy. To overcome these problems, we have developed a strategy to reduce genome complexity by using the large insert BAC libraries combined with next generation sequencing technologies.

Results: We compared two different technologies (Roche-454 and Pacific Biosciences PacBio RS II) to sequence pools of BAC clones in order to obtain the best quality sequence. We targeted nine BAC clones from different species (maize, wheat, strawberry, barley, sugarcane and sunflower) known to be complex in terms of sequence assembly. We sequenced the pools of the nine BAC clones with both technologies. We compared assembly results and highlighted differences due to the sequencing technologies used.

Conclusions: We demonstrated that the long reads obtained with the PacBio RS II technology serve to obtain a better and more reliable assembly, notably by preventing errors due to duplicated or repetitive sequences in the same region.


July 19, 2019  |  

ARTISAN PCR: rapid identification of full-length immunoglobulin rearrangements without primer binding bias.

B cells recognize specific antigens by their membrane-bound B-cell receptor (BCR). Functional BCR genes are assembled in pre-B cells by recombination of the variable (V), diversity (D) and joining (J) genes [V(D)J recombination]. When B cells participate in germinal centre reactions, non-templated point mutations are introduced into BCR genes by somatic hypermutation (SHM) (Rajewsky, 1996). V(D)J recombination and SHM create virtually unlimited BCR repertoires.


July 19, 2019  |  

De novo assembly and phasing of a Korean human genome.

Advances in genome assembly and phasing provide an opportunity to investigate the diploid architecture of the human genome and reveal the full range of structural variation across population groups. Here we report the de novo assembly and haplotype phasing of the Korean individual AK1 (ref. 1) using single-molecule real-time sequencing, next-generation mapping, microfluidics-based linked reads, and bacterial artificial chromosome (BAC) sequencing approaches. Single-molecule sequencing coupled with next-generation mapping generated a highly contiguous assembly, with a contig N50 size of 17.9 Mb and a scaffold N50 size of 44.8 Mb, resolving 8 chromosomal arms into single scaffolds. The de novo assembly, along with local assemblies and spanning long reads, closes 105 and extends into 72 out of 190 euchromatic gaps in the reference genome, adding 1.03 Mb of previously intractable sequence. High concordance between the assembly and paired-end sequences from 62,758 BAC clones provides strong support for the robustness of the assembly. We identify 18,210 structural variants by direct comparison of the assembly with the human reference, identifying thousands of breakpoints that, to our knowledge, have not been reported before. Many of the insertions are reflected in the transcriptome and are shared across the Asian population. We performed haplotype phasing of the assembly with short reads, long reads and linked reads from whole-genome sequencing and with short reads from 31,719 BAC clones, thereby achieving phased blocks with an N50 size of 11.6 Mb. Haplotigs assembled from single-molecule real-time reads assigned to haplotypes on phased blocks covered 89% of genes. The haplotigs accurately characterized the hypervariable major histocompatibility complex region as well as demonstrating allele configuration in clinically relevant genes such as CYP2D6. This work presents the most contiguous diploid human genome assembly so far, with extensive investigation of unreported and Asian-specific structural variants, and high-quality haplotyping of clinically relevant alleles for precision medicine.


July 19, 2019  |  

Extraction of high-molecular-weight genomic DNA for long-read sequencing of single molecules.

De novo sequencing of complex genomes is one of the main challenges for researchers seeking high-quality reference sequences. Many de novo assemblies are based on short reads, producing fragmented genome sequences. Third-generation sequencing, with read lengths >10 kb, will improve the assembly of complex genomes, but these techniques require high-molecular-weight genomic DNA (gDNA), and gDNA extraction protocols used for obtaining smaller fragments for short-read sequencing are not suitable for this purpose. Methods of preparing gDNA for bacterial artificial chromosome (BAC) libraries could be adapted, but these approaches are time-consuming, and commercial kits for these methods are expensive. Here, we present a protocol for rapid, inexpensive extraction of high-molecular-weight gDNA from bacteria, plants, and animals. Our technique was validated using sunflower leaf samples, producing a mean read length of 12.6 kb and a maximum read length of 80 kb.


July 19, 2019  |  

Full-length mitochondrial-DNA sequencing on the PacBio RSII.

Conventional mitochondrial-DNA (MT DNA) sequencing approaches use Sanger sequencing of 20-40 partially overlapping PCR fragments per individual, which is a time- and resource-consuming process. We have developed a high-throughput, accurate, fast, and cost-effective human MT DNA sequencing approach. In this setup we first generate long-range PCR products for two partially overlapping 7.7 and 9.2 kb MT DNA-specific amplicons, add sample-specific barcodes, and sequence these on the PacBio RSII system to obtain full-length MT DNA sequences for genotyping/haplotyping purposes.


July 19, 2019  |  

Targeted capture and sequencing of gene-sized DNA molecules.

Targeted capture provides an efficient and sensitive means for sequencing specific genomic regions in a high-throughput manner. To date, this method has mostly been used to capture exons from the genome (the exome) using short insert libraries and short-read sequencing technology, enabling the identification of genetic variants or new members of large gene families. Sequencing larger molecules results in the capture of whole genes, including intronic and intergenic sequences that are typically more polymorphic and allow the resolution of the gene structure of homologous genes, which are often clustered together on the chromosome. Here, we describe an improved method for the capture and single-molecule sequencing of DNA molecules as large as 7 kb by means of size selection and optimized PCR conditions. Our approach can be used to capture, sequence, and distinguish between similar members of the NB-LRR gene family, key genes in plant immune systems.


July 19, 2019  |  

Examining sources of error in PCR by single-molecule sequencing.

Next-generation sequencing technology has enabled the detection of rare genetic or somatic mutations and contributed to our understanding of disease progression and evolution. However, many next-generation sequencing technologies first rely on DNA amplification, via the Polymerase Chain Reaction (PCR), as part of sample preparation workflows. Mistakes made during PCR appear in sequencing data and contribute to false mutations that can ultimately confound genetic analysis. In this report, a single-molecule sequencing assay was used to comprehensively catalog the different types of errors introduced during PCR, including polymerase misincorporation, structure-induced template-switching, PCR-mediated recombination and DNA damage. In addition to well-characterized polymerase base substitution errors, other sources of error were found to be equally prevalent. PCR-mediated recombination by Taq polymerase was observed at the single-molecule level, and surprisingly found to occur as frequently as polymerase base substitution errors, suggesting it may be an underappreciated source of error for multiplex amplification reactions. Inverted repeat structural elements in lacZ caused polymerase template-switching between the top and bottom strands during replication, and the frequency of these events was measured for different polymerases. For very accurate polymerases, DNA damage introduced during temperature cycling, and not polymerase base substitution errors, appeared to be the major contributor toward mutations occurring in amplification products. In total, we analyzed PCR products at the single-molecule level and present here a more complete picture of the types of mistakes that occur during DNA amplification.


July 19, 2019  |  

Sequencing of Australian wild rice genomes reveals ancestral relationships with domesticated rice.

The related A genome species of the Oryza genus are the effective gene pool for rice. Here, we report draft genomes for two Australian wild A genome taxa: O. rufipogon-like population, referred to as Taxon A, and O. meridionalis-like population, referred to as Taxon B. These two taxa were sequenced and assembled by integration of short- and long-read next-generation sequencing (NGS) data to create a genomic platform for a wider rice gene pool. Here, we report that, despite the distinct chloroplast genome, the nuclear genome of the Australian Taxon A has a sequence that is much closer to that of domesticated rice (O. sativa) than to the other Australian wild populations. Analysis of 4643 genes in the A genome clade showed that the Australian annual, O. meridionalis, and related perennial taxa have the most divergent (around 3 million years) genome sequences relative to domesticated rice. A test for admixture showed possible introgression into the Australian Taxon A (diverged around 1.6 million years ago) especially from the wild indica/O. nivara clade in Asia. These results demonstrate that northern Australia may be the centre of diversity of the A genome Oryza and suggest the possibility that this might also be the centre of origin of this group and represent an important resource for rice improvement.

© 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.


July 19, 2019  |  

The genome of Chenopodium quinoa.

Chenopodium quinoa (quinoa) is a highly nutritious grain identified as an important crop to improve world food security. Unfortunately, few resources are available to facilitate its genetic improvement. Here we report the assembly of a high-quality, chromosome-scale reference genome sequence for quinoa, which was produced using single-molecule real-time sequencing in combination with optical, chromosome-contact and genetic maps. We also report the sequencing of two diploids from the ancestral gene pools of quinoa, which enables the identification of sub-genomes in quinoa, and reduced-coverage genome sequences for 22 other samples of the allotetraploid goosefoot complex. The genome sequence facilitated the identification of the transcription factor likely to control the production of anti-nutritional triterpenoid saponins found in quinoa seeds, including a mutation that appears to cause alternative splicing and a premature stop codon in sweet quinoa strains. These genomic resources are an important first step towards the genetic improvement of quinoa.


July 19, 2019  |  

Single-molecule sequencing and chromatin conformation capture enable de novo reference assembly of the domestic goat genome.

The decrease in sequencing cost and increased sophistication of assembly algorithms for short-read platforms has resulted in a sharp increase in the number of species with genome assemblies. However, these assemblies are highly fragmented, with many gaps, ambiguities, and errors, impeding downstream applications. We demonstrate current state of the art for de novo assembly using the domestic goat (Capra hircus) based on long reads for contig formation, short reads for consensus validation, and scaffolding by optical and chromatin interaction mapping. These combined technologies produced what is, to our knowledge, the most continuous de novo mammalian assembly to date, with chromosome-length scaffolds and only 649 gaps. Our assembly represents a ~400-fold improvement in continuity due to properly assembled gaps, compared to the previously published C. hircus assembly, and better resolves repetitive structures longer than 1 kb, representing the largest repeat family and immune gene complex yet produced for an individual of a ruminant species.


July 19, 2019  |  

Genomic confirmation of vancomycin-resistant Enterococcus transmission from deceased donor to liver transplant recipient.

In a liver transplant recipient with vancomycin-resistant Enterococcus (VRE) surgical site and bloodstream infection, a combination of pulsed-field gel electrophoresis, multilocus sequence typing, and whole genome sequencing identified that donor and recipient VRE isolates were highly similar when compared to time-matched hospital isolates. Comparison of de novo assembled isolate genomes was highly suggestive of transplant transmission rather than hospital-acquired transmission and also identified subtle internal rearrangements between donor and recipient missed by other genomic approaches. Given the improved resolution, whole-genome assembly of pathogen genomes is likely to become an essential tool for investigation of potential organ transplant transmissions.


July 19, 2019  |  

Gorilla MHC class I gene and sequence variation in a comparative context.

Comparisons of MHC gene content and diversity among closely related species can provide insights into the evolutionary mechanisms shaping immune system variation. After chimpanzees and bonobos, gorillas are humans’ closest living relatives; but in contrast, relatively little is known about the structure and variation of gorilla MHC class I genes (Gogo). Here, we combined long-range amplifications and long-read sequencing technology to analyze full-length MHC class I genes in 35 gorillas. We obtained 50 full-length genomic sequences corresponding to 15 Gogo-A alleles, 4 Gogo-Oko alleles, 21 Gogo-B alleles, and 10 Gogo-C alleles, including 19 novel coding region sequences. We identified two previously undetected MHC class I genes related to Gogo-A and Gogo-B, respectively, thereby illustrating the potential of this approach for efficient and highly accurate MHC genotyping. Consistent with their phylogenetic position within the hominid family, individual gorilla MHC haplotypes share characteristics with humans and chimpanzees as well as orangutans, suggesting a complex history of the MHC class I genes in humans and the great apes. However, the overall MHC class I diversity appears to be low, further supporting the hypothesis that gorillas might have experienced a reduction of their MHC repertoire.


July 19, 2019  |  

Genomic structure of the horse major histocompatibility complex class II region resolved using PacBio long-read sequencing technology.

The mammalian Major Histocompatibility Complex (MHC) region contains several gene families characterized by highly polymorphic loci with extensive nucleotide diversity, copy number variation of paralogous genes, and long repetitive sequences. This structural complexity has made it difficult to construct a reliable reference sequence of the horse MHC region. In this study, we used long-read single molecule, real-time (SMRT) sequencing technology from Pacific Biosciences (PacBio) to sequence eight Bacterial Artificial Chromosome (BAC) clones spanning the horse MHC class II region. The final assembly resulted in a 1,165,328 bp continuous gap-free sequence with 35 manually curated genomic loci, of which 23 were considered to be functional and 12 to be pseudogenes. In comparison to the MHC class II region in other mammals, the corresponding region in horse shows extraordinary copy number variation and different relative location and directionality of the Eqca-DRB, -DQA, -DQB and -DOB loci. This is the first long-read sequence assembly of the horse MHC class II region with rigorous manual gene annotation, and it will serve as an important resource for association studies of immune-mediated equine diseases and for evolutionary analysis of genetic diversity in this region.

