September 21, 2019

Comparative genomics of enterohemorrhagic Escherichia coli O145:H28 demonstrates a common evolutionary lineage with Escherichia coli O157:H7.

Although serotype O157:H7 is the predominant enterohemorrhagic Escherichia coli (EHEC), outbreaks of non-O157 EHEC that cause severe foodborne illness, including hemolytic uremic syndrome, have increased worldwide. In fact, non-O157 serotypes are now estimated to cause over half of all the Shiga toxin-producing Escherichia coli (STEC) cases, and outbreaks of non-O157 EHEC infections are frequently associated with serotypes O26, O45, O103, O111, O121, and O145. Currently, there are no complete genomes for O145 in public databases. We determined the complete genome sequences of two O145 strains (EcO145), one linked to a US lettuce-associated outbreak (RM13514) and one to a Belgium ice-cream-associated outbreak (RM13516). Both strains contain one chromosome and two large plasmids, with genome sizes of 5,737,294 bp for RM13514 and 5,559,008 bp for RM13516. Comparative analysis of the two EcO145 genomes revealed a large core (5,173 genes) and a considerable number of strain-specific genes. Additionally, the two EcO145 genomes display distinct chromosomal architecture, virulence gene profile, phylogenetic origin of Stx2a prophage, and methylation profile (methylome). Comparative analysis of EcO145 genomes to other completely sequenced STEC and other E. coli and Shigella genomes revealed that, unlike any other known non-O157 EHEC strain, EcO145 ascended from a common lineage with EcO157/EcO55. This evolutionary relationship was further supported by the pangenome analysis of the 10 EHEC strains. Of the 4,192 EHEC core genes, EcO145 shares more genes with EcO157 than with any other non-O157 EHEC strains. Our data provide evidence that EcO145 and EcO157 evolved from a common lineage, but ultimately each serotype evolves via a lineage-independent nature to EHEC by acquisition of the core set of EHEC virulence factors, including the genes encoding Shiga toxin and the large virulence plasmid.
The large variation between the two EcO145 genomes suggests a distinctive evolutionary path between the two outbreak strains. The distinct methylome between the two EcO145 strains is likely due to the presence of a BsuBI/PstI methyltransferase gene cassette in the Stx2a prophage of the strain RM13514, suggesting a role of horizontal gene transfer-mediated epigenetic alteration in the evolution of individual EHEC strains.


September 21, 2019

Mistranslation drives the evolution of robustness in TEM-1 β-lactamase.

How biological systems such as proteins achieve robustness to ubiquitous perturbations is a fundamental biological question. Such perturbations include errors that introduce phenotypic mutations into nascent proteins during the translation of mRNA. These errors are remarkably frequent. They are also costly, because they reduce protein stability and help create toxic misfolded proteins. Adaptive evolution might reduce these costs of protein mistranslation by two principal mechanisms. The first increases the accuracy of translation via synonymous “high fidelity” codons at especially sensitive sites. The second increases the robustness of proteins to phenotypic errors via amino acids that increase protein stability. To study how these mechanisms are exploited by populations evolving in the laboratory, we evolved the antibiotic resistance gene TEM-1 in Escherichia coli hosts with either normal or high rates of mistranslation. We analyzed TEM-1 populations that evolved under relaxed and stringent selection for antibiotic resistance by single molecule real-time sequencing. Under relaxed selection, mistranslating populations reduce mistranslation costs by reducing TEM-1 expression. Under stringent selection, they efficiently purge destabilizing amino acid changes. More importantly, they accumulate stabilizing amino acid changes rather than synonymous changes that increase translational accuracy. In the large populations we study, and on short evolutionary timescales, the path of least resistance in TEM-1 evolution consists of reducing the consequences of translation errors rather than the errors themselves.


September 21, 2019

Characterization of multi-drug resistant Enterococcus faecalis isolated from cephalic recording chambers in research macaques (Macaca spp.).

Nonhuman primates are commonly used for cognitive neuroscience research and often surgically implanted with cephalic recording chambers for electrophysiological recording. Aerobic bacterial cultures from 25 macaques identified 72 bacterial isolates, including 15 Enterococcus faecalis isolates. The E. faecalis isolates displayed multi-drug resistant phenotypes, with resistance to ciprofloxacin, enrofloxacin, trimethoprim-sulfamethoxazole, tetracycline, chloramphenicol, bacitracin, and erythromycin, as well as high-level aminoglycoside resistance. Multi-locus sequence typing showed that most belonged to two E. faecalis sequence types (ST): ST 4 and ST 55. The genomes of three representative isolates were sequenced to identify genes encoding antimicrobial resistances and other traits. Antimicrobial resistance genes identified included aac(6′)-aph(2″), aph(3′)-III, str, ant(6)-Ia, tetM, tetS, tetL, ermB, bcrABR, cat, and dfrG, and polymorphisms in parC (S80I) and gyrA (S83I) were observed. These isolates also harbored virulence factors including the cytolysin toxin genes in ST 4 isolates, as well as multiple biofilm-associated genes (esp, agg, ace, SrtA, gelE, ebpABC), hyaluronidases (hylA, hylB), and other survival genes (ElrA, tpx). Crystal violet biofilm assays confirmed that ST 4 isolates produced more biofilm than ST 55 isolates. The abundance of antimicrobial resistance and virulence factor genes in the ST 4 isolates likely relates to the loss of CRISPR-cas. This macaque colony represents a unique model for studying E. faecalis infection associated with indwelling devices, and provides an opportunity to understand the basis of persistence of this pathogen in a healthcare setting.


September 21, 2019

Whole genome sequence of the soybean aphid, Aphis glycines.

Aphids are emerging as model organisms for both basic and applied research. Of the 5,000 estimated species, only three aphids have published whole genome sequences: the pea aphid Acyrthosiphon pisum, the Russian wheat aphid, Diuraphis noxia, and the green peach aphid, Myzus persicae. We present the whole genome sequence of a fourth aphid, the soybean aphid (Aphis glycines), which is an extreme specialist and an important invasive pest of soybean (Glycine max). The availability of genomic resources is important to establish effective and sustainable pest control, as well as to expand our understanding of aphid evolution. We generated a 302.9 Mbp draft genome assembly for Ap. glycines using a hybrid sequencing approach. This assembly shows high completeness with 19,182 predicted genes, 92% of known Ap. glycines transcripts mapping to contigs, and substantial continuity with a scaffold N50 of 174,505 bp. The assembly represents 95.5% of the predicted genome size of 317.1 Mbp based on flow cytometry. Ap. glycines contains the smallest known aphid genome to date, based on updated genome sizes for 19 aphid species. The repetitive DNA content of the Ap. glycines genome assembly (81.6 Mbp or 26.94% of the 302.9 Mbp assembly) shows a reduction in the number of classified transposable elements compared to Ac. pisum, and likely contributes to the small estimated genome size. We include comparative analyses of gene families related to host-specificity (cytochrome P450’s and effectors), which may be important in Ap. glycines evolution. This Ap. glycines draft genome sequence will provide a resource for the study of aphid genome evolution, their interaction with host plants, and candidate genes for novel insect control methods. Copyright © 2017 Elsevier Ltd. All rights reserved.
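The assembly statistics quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the standard definition of N50 (the length of the sequence at which the running total of lengths, sorted longest first, reaches half of the assembly). The function name `n50` and the toy scaffold lengths are hypothetical and are not taken from the study; only the Mbp figures come from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Lengths are sorted longest first; the N50 is the length at which the
    running total first reaches at least half of the summed length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy scaffold lengths, for illustration only.
toy_scaffolds = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_scaffolds)

# Figures quoted in the abstract (Mbp).
assembly_mbp = 302.9
predicted_genome_mbp = 317.1
repeat_mbp = 81.6

# Share of the flow-cytometry genome size captured by the assembly.
percent_assembled = round(100 * assembly_mbp / predicted_genome_mbp, 1)
# Share of the assembly made up of repetitive DNA.
percent_repeats = round(100 * repeat_mbp / assembly_mbp, 2)

print(toy_n50, percent_assembled, percent_repeats)
```

Under these assumptions the two percentages reproduce the 95.5% and 26.94% reported in the abstract.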


September 21, 2019

PacBio assembly of a Plasmodium knowlesi genome sequence with Hi-C correction and manual annotation of the SICAvar gene family.

Plasmodium knowlesi has risen in importance as a zoonotic parasite that has been causing regular episodes of malaria throughout South East Asia. The P. knowlesi genome sequence generated in 2008 highlighted and confirmed many similarities and differences in Plasmodium species, including a global view of several multigene families, such as the large SICAvar multigene family encoding the variant antigens known as the schizont-infected cell agglutination proteins. However, repetitive DNA sequences are the bane of any genome project, and this and other Plasmodium genome projects have not been immune to the gaps, rearrangements and other pitfalls created by these genomic features. Today, long-read PacBio and chromatin conformation technologies are overcoming such obstacles. Here, based on the use of these technologies, we present a highly refined de novo P. knowlesi genome sequence of the Pk1(A+) clone. This sequence and annotation, referred to as the ‘MaHPIC Pk genome sequence’, includes manual annotation of the SICAvar gene family with 136 full-length members categorized as type I or II. This sequence provides a framework that will permit a better understanding of the SICAvar repertoire, selective pressures acting on this gene family and mechanisms of antigenic variation in this species and other pathogens.


September 21, 2019

Potato late blight field resistance from QTL dPI09c is conferred by the NB-LRR gene R8.

Following the often short-lived protection that major nucleotide binding, leucine-rich-repeat (NB-LRR) resistance genes offer against the potato pathogen Phytophthora infestans, field resistance was thought to provide a more durable alternative to prevent late blight disease. We previously identified the QTL dPI09c on potato chromosome 9 as a more durable field resistance source against late blight. Here, the resistance QTL was fine-mapped to a 186 kb region. The interval corresponds to a larger, 389 kb, genomic region in the potato reference genome of Solanum tuberosum Group Phureja doubled monoploid clone DM1-3 (DM) and from which functional NB-LRRs R8, R9a, Rpi-moc1, and Rpi_vnt1 have arisen independently in wild species. dRenSeq analysis of parental clones alongside resistant and susceptible bulks of the segregating population B3C1HP showed full sequence representation of R8. This was independently validated using long-range PCR and screening of a bespoke bacterial artificial chromosome library. The latter enabled a comparative analysis of the sequence variation in this locus in diverse Solanaceae. We reveal for the first time that broad spectrum and durable field resistance against P. infestans is conferred by the NB-LRR gene R8, which is thought to provide narrow spectrum race-specific resistance.


September 21, 2019

A flexible and efficient template format for circular consensus sequencing and SNP detection.

A novel template design for single-molecule sequencing is introduced, a structure we refer to as a SMRTbell template. This structure consists of a double-stranded portion, containing the insert of interest, and a single-stranded hairpin loop on either end, which provides a site for primer binding. Structurally, this format resembles a linear double-stranded molecule, and yet it is topologically circular. When placed into a single-molecule sequencing reaction, the SMRTbell template format enables a consensus sequence to be obtained from multiple passes on a single molecule. Furthermore, this consensus sequence is obtained from both the sense and antisense strands of the insert region. In this article, we present a universal method for constructing these templates, as well as an application of their use. We demonstrate the generation of high-quality consensus accuracy from single molecules, as well as the use of SMRTbell templates in the identification of rare sequence variants.
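The idea of building a consensus from repeated passes over both strands of one molecule can be illustrated with a simple column-wise majority vote. This is a sketch under strong assumptions, not the method used in the article: the passes are taken to be already aligned and of equal length, and antisense passes are simply reverse-complemented before voting. The function names and the toy reads are hypothetical.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def majority_consensus(passes):
    """Column-wise majority vote over equal-length, pre-aligned passes."""
    if not passes:
        raise ValueError("at least one pass is required")
    if len({len(p) for p in passes}) != 1:
        raise ValueError("passes must be pre-aligned to equal length")
    # zip(*passes) yields one tuple of bases per alignment column.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*passes))


# Hypothetical toy reads of one insert: two sense passes (one with an error)
# and one antisense pass, which is reoriented before voting.
sense_passes = ["ACGTAC", "ACGAAC"]
antisense_passes = ["GTACGT"]
oriented = sense_passes + [reverse_complement(p) for p in antisense_passes]
consensus = majority_consensus(oriented)
print(consensus)
```

A real consensus caller would also have to align the passes and handle insertions and deletions, which this sketch deliberately leaves out.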


September 21, 2019

Identification of a novel RASD1 somatic mutation in a USP8-mutated corticotroph adenoma.

Cushing’s disease (CD) is caused by pituitary corticotroph adenomas that secrete excess adrenocorticotropic hormone (ACTH). In these tumors, somatic mutations in the gene USP8 have been identified as recurrent and pathogenic and are the sole known molecular driver for CD. Although other somatic mutations were reported in these studies, their contribution to the pathogenesis of CD remains unexplored. No molecular drivers have been established for a large proportion of CD cases and tumor heterogeneity has not yet been investigated using genomics methods. Also, even in USP8-mutant tumors, a possibility may exist of additional contributing mutations, following a paradigm from other neoplasm types where multiple somatic alterations contribute to neoplastic transformation. The current study utilizes whole-exome discovery sequencing on the Illumina platform, followed by targeted amplicon-validation sequencing on the Pacific Biosciences platform, to interrogate the somatic mutation landscape in a corticotroph adenoma resected from a CD patient. In this USP8-mutated tumor, we identified an interesting somatic mutation in the gene RASD1, which is a component of the corticotropin-releasing hormone receptor signaling system. This finding may provide insight into a novel mechanism involving loss of feedback control to the corticotropin-releasing hormone receptor and subsequent deregulation of ACTH production in corticotroph tumors.


September 21, 2019

Long-read genome sequencing identifies causal structural variation in a Mendelian disease.

Purpose: Current clinical genomics assays primarily utilize short-read sequencing (SRS), but SRS has limited ability to evaluate repetitive regions and structural variants. Long-read sequencing (LRS) has complementary strengths, and we aimed to determine whether LRS could offer a means to identify overlooked genetic variation in patients undiagnosed by SRS. Methods: We performed low-coverage genome LRS to identify structural variants in a patient who presented with multiple neoplasia and cardiac myxomata, in whom the results of targeted clinical testing and genome SRS were negative. Results: This LRS approach yielded 6,971 deletions and 6,821 insertions >50 bp. Filtering for variants that are absent in an unrelated control and overlap a disease gene coding exon identified three deletions and three insertions. One of these, a heterozygous 2,184 bp deletion, overlaps the first coding exon of PRKAR1A, which is implicated in autosomal dominant Carney complex. RNA sequencing demonstrated decreased PRKAR1A expression. The deletion was classified as pathogenic based on guidelines for interpretation of sequence variants. Conclusion: This first successful application of genome LRS to identify a pathogenic variant in a patient suggests that LRS has significant potential for the identification of disease-causing structural variation. Larger studies will ultimately be required to evaluate the potential clinical utility of LRS.
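The filtering step described in the results (size above 50 bp, absent in an unrelated control, overlapping a disease gene coding exon) can be sketched as a small interval filter. This is an illustrative sketch only: the `Variant` class, the exact-match test against the control, the half-open coordinate convention, and all coordinates below are assumptions made for the example and do not come from the study. Only the 2,184 bp deletion size is taken from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive; an insertion is stored as [start, start + 1)
    kind: str    # "DEL" or "INS"
    length: int  # size of the deleted or inserted sequence in bp


def intervals_overlap(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one position."""
    return a_start < b_end and b_start < a_end


def filter_candidates(variants, control_variants, exons, min_len=50):
    """Keep variants longer than min_len, absent in the control, and
    overlapping at least one coding exon given as (chrom, start, end)."""
    control = set(control_variants)  # assumption: exact match defines "present in control"
    kept = []
    for v in variants:
        if v.length <= min_len or v in control:
            continue
        if any(c == v.chrom and intervals_overlap(v.start, v.end, s, e)
               for c, s, e in exons):
            kept.append(v)
    return kept


# Hypothetical coordinates, for illustration only.
exons = [("chr17", 3000, 3200), ("chr1", 520, 540)]
patient = [
    Variant("chr17", 1000, 3184, "DEL", 2184),  # overlaps an exon, not in control
    Variant("chr1", 500, 560, "DEL", 60),       # also seen in the control
    Variant("chr2", 100, 140, "DEL", 40),       # too short
    Variant("chr3", 100, 400, "DEL", 300),      # no exon overlap
]
control = [Variant("chr1", 500, 560, "DEL", 60)]
candidates = filter_candidates(patient, control, exons)
print(candidates)
```

In practice, matching a variant against a control usually allows for breakpoint tolerance rather than exact equality; the exact match here is kept only to make the sketch short.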


September 21, 2019

The axolotl genome and the evolution of key tissue formation regulators.

Salamanders serve as important tetrapod models for developmental, regeneration and evolutionary studies. An extensive molecular toolkit makes the Mexican axolotl (Ambystoma mexicanum) a key representative salamander for molecular investigations. Here we report the sequencing and assembly of the 32-gigabase-pair axolotl genome using an approach that combined long-read sequencing, optical mapping and development of a new genome assembler (MARVEL). We observed a size expansion of introns and intergenic regions, largely attributable to multiplication of long terminal repeat retroelements. We provide evidence that intron size in developmental genes is under constraint and that species-restricted genes may contribute to limb regeneration. The axolotl genome assembly does not contain the essential developmental gene Pax3. However, mutation of the axolotl Pax3 paralogue Pax7 resulted in an axolotl phenotype that was similar to those seen in Pax3-/- and Pax7-/- mutant mice. The axolotl genome provides a rich biological resource for developmental and evolutionary studies.


September 21, 2019

A Sequel to Sanger: amplicon sequencing that scales.

Although high-throughput sequencers (HTS) have largely displaced their Sanger counterparts, the short read lengths and high error rates of most platforms constrain their utility for amplicon sequencing. The present study tests the capacity of single molecule, real-time (SMRT) sequencing implemented on the SEQUEL platform to overcome these limitations, employing 658 bp amplicons of the mitochondrial cytochrome c oxidase I gene as a model system. By examining templates from more than 5000 species and 20,000 specimens, the performance of SMRT sequencing was tested with amplicons showing wide variation in GC composition and varied sequence attributes. SMRT and Sanger sequences were very similar, but SMRT sequencing provided more complete coverage, especially for amplicons with homopolymer tracts. Because it can characterize amplicon pools from 10,000 DNA extracts in a single run, the SEQUEL can greatly reduce sequencing costs in comparison to first (Sanger) and second generation platforms (Illumina, Ion). SMRT analysis generates high-fidelity sequences from amplicons with varying GC content and is resilient to homopolymer tracts. Analytical costs are low, substantially less than those for first or second generation sequencers. When implemented on the SEQUEL platform, SMRT analysis enables massive amplicon characterization because each instrument can recover sequences from more than 5 million DNA extracts a year.


September 21, 2019

Population sequencing reveals clonal diversity and ancestral inbreeding in the grapevine cultivar Chardonnay.

Chardonnay is the basis of some of the world’s most iconic wines and its success is underpinned by a historic program of clonal selection. There are numerous clones of Chardonnay available that exhibit differences in key viticultural and oenological traits that have arisen from the accumulation of somatic mutations during centuries of asexual propagation. However, the genetic variation that underlies these differences remains largely unknown. To address this knowledge gap, a high-quality, diploid-phased Chardonnay genome assembly was produced from single-molecule real-time sequencing, and combined with re-sequencing data from 15 different Chardonnay clones. There were 1620 markers identified that distinguish the 15 clones. These markers were reliably used for clonal identification of independently sourced genomic material, as well as in identifying a potential genetic basis for some clonal phenotypic differences. The predicted parentage of the Chardonnay haplomes was elucidated by mapping sequence data from the predicted parents of Chardonnay (Gouais blanc and Pinot noir) against the Chardonnay reference genome. This enabled the detection of instances of heterosis, with differentially-expanded gene families being inherited from the parents of Chardonnay. Most surprisingly, however, the patterns of nucleotide variation present in the Chardonnay genome indicate that Pinot noir and Gouais blanc share an extremely high degree of kinship that has resulted in the Chardonnay genome displaying characteristics that are indicative of inbreeding.


July 19, 2019

Parallel confocal detection of single molecules in real time.

The confocal detection principle is extended to a highly parallel optical system that continuously analyzes thousands of concurrent sample locations. This is achieved through the use of a holographic laser illumination multiplexer combined with a confocal pinhole array before a prism dispersive element used to provide spectroscopic information from each confocal volume. The system is demonstrated to detect and identify single fluorescent molecules from each of several thousand independent confocal volumes in real time.


July 19, 2019

Comparison of single-molecule sequencing and hybrid approaches for finishing the genome of Clostridium autoethanogenum and analysis of CRISPR systems in industrial relevant Clostridia.

Clostridium autoethanogenum strain JA1-1 (DSM 10061) is an acetogen capable of fermenting CO, CO2 and H2 (e.g. from syngas or waste gases) into biofuel ethanol and commodity chemicals such as 2,3-butanediol. A draft genome sequence consisting of 100 contigs has been published. A closed, high-quality genome sequence for C. autoethanogenum DSM10061 was generated using only the latest single-molecule DNA sequencing technology and without the need for manual finishing. It is assigned to the most complex genome classification based upon genome features such as repeats, prophage, and nine copies of the rRNA gene operons. It has a low G + C content of 31.1%. Illumina, 454, and Illumina/454 hybrid assemblies were generated and then compared to the draft and PacBio assemblies using summary statistics, CGAL, QUAST and REAPR bioinformatics tools and comparative genomic approaches. Assemblies based upon shorter read DNA technologies were confounded by the large number of repeats and their size, which in the case of the rRNA gene operons were ~5 kb. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) systems among biotechnologically relevant Clostridia were classified and related to plasmid content and prophages. Potential associations between plasmid content and CRISPR systems may have implications for historical industrial scale Acetone-Butanol-Ethanol (ABE) fermentation failures and future large scale bacterial fermentations. While C. autoethanogenum contains an active CRISPR system, no such system is present in the closely related Clostridium ljungdahlii DSM 13528. A common prophage inserted into the Arg-tRNA shared between the strains suggests a common ancestor. However, C. ljungdahlii contains several additional putative prophages and it has more than double the amount of prophage DNA compared to C. autoethanogenum.
Other differences include important metabolic genes for central metabolism (such as an additional hydrogenase and the absence of a phosphoenolpyruvate synthase) and substrate utilization pathway (mannose and aromatics utilization) that might explain phenotypic differences between C. autoethanogenum and C. ljungdahlii. Single molecule sequencing will be increasingly used to produce finished microbial genomes. The complete genome will facilitate comparative genomics and functional genomics and support future comparisons between Clostridia and studies that examine the evolution of plasmids, bacteriophage and CRISPR systems.


July 19, 2019

Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution.

Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes.

