September 21, 2019

Long-read genome sequencing identifies causal structural variation in a Mendelian disease.

Purpose: Current clinical genomics assays primarily utilize short-read sequencing (SRS), but SRS has limited ability to evaluate repetitive regions and structural variants. Long-read sequencing (LRS) has complementary strengths, and we aimed to determine whether LRS could offer a means to identify overlooked genetic variation in patients undiagnosed by SRS. Methods: We performed low-coverage genome LRS to identify structural variants in a patient who presented with multiple neoplasia and cardiac myxomata, in whom the results of targeted clinical testing and genome SRS were negative. Results: This LRS approach yielded 6,971 deletions and 6,821 insertions >50 bp. Filtering for variants that are absent in an unrelated control and overlap a disease gene coding exon identified three deletions and three insertions. One of these, a heterozygous 2,184 bp deletion, overlaps the first coding exon of PRKAR1A, which is implicated in autosomal dominant Carney complex. RNA sequencing demonstrated decreased PRKAR1A expression. The deletion was classified as pathogenic based on guidelines for interpretation of sequence variants. Conclusion: This first successful application of genome LRS to identify a pathogenic variant in a patient suggests that LRS has significant potential for the identification of disease-causing structural variation. Larger studies will ultimately be required to evaluate the potential clinical utility of LRS.
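
The filtering step described in the Results (size threshold, absence from a control, overlap with a disease gene coding exon) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the tools or matching criteria used, so the `SV` and `Exon` records, the exact-coordinate comparison against the control, and the function names are all hypothetical. Real pipelines usually match variants between samples with a tolerance (for example reciprocal overlap) rather than exact coordinates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical structural-variant record (half-open coordinates)."""
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive; for insertions we assume end = start + 1
    svtype: str  # "DEL" or "INS"
    length: int  # size of the deleted or inserted sequence in bp


@dataclass(frozen=True)
class Exon:
    """Hypothetical coding-exon record for a disease gene."""
    chrom: str
    start: int
    end: int
    gene: str


def overlaps(sv: SV, exon: Exon) -> bool:
    """True if the two half-open intervals share at least one base."""
    return sv.chrom == exon.chrom and sv.start < exon.end and exon.start < sv.end


def filter_candidates(patient_svs, control_svs, exons, min_len=50):
    """Keep SVs longer than min_len bp, absent from the control,
    and overlapping at least one disease gene coding exon."""
    # Assumption: a variant is "present in the control" only on exact match.
    control_keys = {(v.chrom, v.start, v.end, v.svtype) for v in control_svs}
    hits = []
    for sv in patient_svs:
        if sv.length <= min_len:
            continue  # below the >50 bp size threshold
        if (sv.chrom, sv.start, sv.end, sv.svtype) in control_keys:
            continue  # also seen in the unrelated control
        genes = sorted({e.gene for e in exons if overlaps(sv, e)})
        if genes:
            hits.append((sv, genes))
    return hits
```

Each returned item pairs a surviving variant with the names of the genes whose exons it overlaps, which is the short list a reviewer would then inspect by hand.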


September 21, 2019

The axolotl genome and the evolution of key tissue formation regulators.

Salamanders serve as important tetrapod models for developmental, regeneration and evolutionary studies. An extensive molecular toolkit makes the Mexican axolotl (Ambystoma mexicanum) a key representative salamander for molecular investigations. Here we report the sequencing and assembly of the 32-gigabase-pair axolotl genome using an approach that combined long-read sequencing, optical mapping and development of a new genome assembler (MARVEL). We observed a size expansion of introns and intergenic regions, largely attributable to multiplication of long terminal repeat retroelements. We provide evidence that intron size in developmental genes is under constraint and that species-restricted genes may contribute to limb regeneration. The axolotl genome assembly does not contain the essential developmental gene Pax3. However, mutation of the axolotl Pax3 paralogue Pax7 resulted in an axolotl phenotype that was similar to those seen in Pax3-/- and Pax7-/- mutant mice. The axolotl genome provides a rich biological resource for developmental and evolutionary studies.


September 21, 2019

A Sequel to Sanger: amplicon sequencing that scales.

Although high-throughput sequencers (HTS) have largely displaced their Sanger counterparts, the short read lengths and high error rates of most platforms constrain their utility for amplicon sequencing. The present study tests the capacity of single molecule, real-time (SMRT) sequencing implemented on the SEQUEL platform to overcome these limitations, employing 658 bp amplicons of the mitochondrial cytochrome c oxidase I gene as a model system. By examining templates from more than 5000 species and 20,000 specimens, the performance of SMRT sequencing was tested with amplicons showing wide variation in GC composition and varied sequence attributes. SMRT and Sanger sequences were very similar, but SMRT sequencing provided more complete coverage, especially for amplicons with homopolymer tracts. Because it can characterize amplicon pools from 10,000 DNA extracts in a single run, the SEQUEL can greatly reduce sequencing costs in comparison to first (Sanger) and second generation platforms (Illumina, Ion). SMRT analysis generates high-fidelity sequences from amplicons with varying GC content and is resilient to homopolymer tracts. Analytical costs are low, substantially less than those for first or second generation sequencers. When implemented on the SEQUEL platform, SMRT analysis enables massive amplicon characterization because each instrument can recover sequences from more than 5 million DNA extracts a year.


September 21, 2019

Detecting AGG interruptions in females with a FMR1 premutation by long-read Single-Molecule Sequencing: A 1 year clinical experience.

The fragile X syndrome arises from the FMR1 CGG expansion of a premutation (55-200 repeats) to a full mutation allele (>200 repeats) and is the most frequent cause of inherited X-linked intellectual disability. The risk for a premutation to expand to a full mutation allele depends on the repeat length and AGG triplets interrupting this repeat. In genetic counseling it is important to have information on both these parameters to provide an accurate risk estimate to women carrying a premutation allele and weighing up having children. For example, if the risk is small a woman might opt for a natural pregnancy followed by prenatal diagnosis, whereas she might choose preimplantation genetic diagnosis (PGD) if the risk is high. Unfortunately, the detection of AGG interruptions was previously hampered by technical difficulties complicating their use in diagnostics. Therefore we recently developed, validated and implemented a new methodology which uses long-read single-molecule sequencing to identify AGG interruptions in females with a FMR1 premutation. Here we report on the assets of AGG interruption detection by sequencing and the impact of implementing the assay on genetic counseling.
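
The two parameters named above, total repeat length and the AGG triplets interrupting the repeat, can be read off a single long read spanning the repeat roughly as sketched below. This is a minimal illustration, not the published assay: the function name is hypothetical, the input is assumed to be an already-trimmed, in-frame repeat tract containing only CGG and AGG triplets, and counting AGG triplets toward the total repeat length is an assumption made here. The size ranges (55-200 repeats for a premutation, >200 for a full mutation) are the ones stated in the abstract.

```python
def summarize_repeat(tract: str) -> dict:
    """Summarize an FMR1-style repeat tract given as a DNA string.

    Assumes the tract is already trimmed to the repeat and in frame,
    so that it consists only of CGG and AGG triplets.
    """
    seq = tract.upper()
    if len(seq) % 3 != 0:
        raise ValueError("tract length is not a multiple of 3")
    triplets = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    unexpected = {t for t in triplets if t not in ("CGG", "AGG")}
    if unexpected:
        raise ValueError(f"unexpected triplets: {sorted(unexpected)}")

    # 1-based positions of the AGG interruptions within the tract
    agg_positions = [i + 1 for i, t in enumerate(triplets) if t == "AGG"]

    # Assumption: AGG triplets count toward the total repeat length.
    total = len(triplets)
    if total > 200:
        category = "full mutation"
    elif total >= 55:
        category = "premutation"
    else:
        category = "below premutation range"

    return {
        "total_repeats": total,
        "agg_count": len(agg_positions),
        "agg_positions": agg_positions,
        "category": category,
    }
```

For a tract built as `"CGG" * 9 + "AGG" + "CGG" * 9 + "AGG" + "CGG" * 40`, the summary reports 60 repeats with two AGG interruptions, which falls in the premutation range.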


September 21, 2019

Repair of double-strand breaks induced by CRISPR-Cas9 leads to large deletions and complex rearrangements.

CRISPR-Cas9 is poised to become the gene editing tool of choice in clinical contexts. Thus far, exploration of Cas9-induced genetic alterations has been limited to the immediate vicinity of the target site and distal off-target sequences, leading to the conclusion that CRISPR-Cas9 was reasonably specific. Here we report significant on-target mutagenesis, such as large deletions and more complex genomic rearrangements at the targeted sites in mouse embryonic stem cells, mouse hematopoietic progenitors and a human differentiated cell line. Using long-read sequencing and long-range PCR genotyping, we show that DNA breaks introduced by single-guide RNA/Cas9 frequently resolved into deletions extending over many kilobases. Furthermore, lesions distal to the cut site and crossover events were identified. The observed genomic damage in mitotically active cells caused by CRISPR-Cas9 editing may have pathogenic consequences.


July 7, 2019

Azotobacter genomes: The genome of Azotobacter chroococcum NCIMB 8003 (ATCC 4412).

The genome of the soil-dwelling heterotrophic N2-fixing Gram-negative bacterium Azotobacter chroococcum NCIMB 8003 (ATCC 4412) (Ac-8003) has been determined. It consists of 7 circular replicons totalling 5,192,291 bp comprising a circular chromosome of 4,591,803 bp and six plasmids pAcX50a, b, c, d, e, f of 10,435 bp, 13,852, 62,783, 69,713, 132,724, and 311,724 bp respectively. The chromosome has a G+C content of 66.27% and the six plasmids have G+C contents of 58.1, 55.3, 56.7, 59.2, 61.9, and 62.6% respectively. The methylome has also been determined and 5 methylation motifs have been identified. The genome also contains a very high number of transposase/inactivated transposase genes from at least 12 of the 17 recognised insertion sequence families. The Ac-8003 genome has been compared with that of Azotobacter vinelandii ATCC BAA-1303 (Av-DJ), a derivative of strain O, the only other member of the Azotobacteraceae determined so far which has a single chromosome of 5,365,318 bp and no plasmids. The chromosomes show significant stretches of synteny throughout but also reveal a history of many deletion/insertion events. The Ac-8003 genome encodes 4628 predicted protein-encoding genes of which 568 (12.2%) are plasmid borne. 3048 (65%) of these show > 85% identity to the 5050 protein-encoding genes identified in Av-DJ, and of these 99 are plasmid-borne. The core biosynthetic and metabolic pathways and macromolecular architectures and machineries of these organisms appear largely conserved including genes for CO-dehydrogenase, formate dehydrogenase and a soluble NiFe-hydrogenase. The genetic bases for many of the detailed phenotypic differences reported for these organisms have also been identified. Also many other potential phenotypic differences have been uncovered. Properties endowed by the plasmids are described including the presence of an entire aerobic corrin synthesis pathway in pAcX50f and the presence of genes for retro-conjugation in pAcX50c. 
All these findings are related to the potentially different environmental niches from which these organisms were isolated and to emerging theories about how microbes contribute to their communities.


July 7, 2019

Novel recA-independent horizontal gene transfer in Escherichia coli K-12.

In bacteria, mechanisms that incorporate DNA into a genome without strand-transfer proteins such as RecA play a major role in generating novelty by horizontal gene transfer. We describe a new illegitimate recombination event in Escherichia coli K-12: RecA-independent homologous replacements, with very large (megabase-length) donor patches replacing recipient DNA. A previously uncharacterized gene (yjiP) increases the frequency of RecA-independent replacement recombination. To show this, we used conjugal DNA transfer, combining a classical conjugation donor, HfrH, with modern genome engineering methods and whole genome sequencing analysis to enable interrogation of genetic dependence of integration mechanisms and characterization of recombination products. As in classical experiments, genomic DNA transfer begins at a unique position in the donor, entering the recipient via conjugation; antibiotic resistance markers are then used to select recombinant progeny. Different configurations of this system were used to compare known mechanisms for stable DNA incorporation, including homologous recombination, F’-plasmid formation, and genome duplication. A genome island of interest known as the immigration control region was specifically replaced in a minority of recombinants, at a frequency of 3 × 10^-12 CFU/recipient per hour.


July 7, 2019

Retrohoming of a mobile group II intron in human cells suggests how eukaryotes limit group II intron proliferation.

Mobile bacterial group II introns are evolutionary ancestors of spliceosomal introns and retroelements in eukaryotes. They consist of an autocatalytic intron RNA (a “ribozyme”) and an intron-encoded reverse transcriptase, which function together to promote intron integration into new DNA sites by a mechanism termed “retrohoming”. Although mobile group II introns splice and retrohome efficiently in bacteria, all examined thus far function inefficiently in eukaryotes, where their ribozyme activity is limited by low Mg2+ concentrations, and intron-containing transcripts are subject to nonsense-mediated decay (NMD) and translational repression. Here, by using RNA polymerase II to express a humanized group II intron reverse transcriptase and T7 RNA polymerase to express intron transcripts resistant to NMD, we find that simply supplementing culture medium with Mg2+ induces the Lactococcus lactis Ll.LtrB intron to retrohome into plasmid and chromosomal sites, the latter at frequencies up to ~0.1%, in viable HEK-293 cells. Surprisingly, under these conditions, the Ll.LtrB intron reverse transcriptase is required for retrohoming but not for RNA splicing as in bacteria. By using a genetic assay for in vivo selections combined with deep sequencing, we identified intron RNA mutations that enhance retrohoming in human cells, but <4-fold and not without added Mg2+. Further, the selected mutations lie outside the ribozyme catalytic core, which appears not readily modified to function efficiently at low Mg2+ concentrations. Our results reveal differences between group II intron retrohoming in human cells and bacteria and suggest constraints on critical nucleotide residues of the ribozyme core that limit how much group II intron retrohoming in eukaryotes can be enhanced. 
These findings have implications for group II intron use for gene targeting in eukaryotes and suggest how differences in intracellular Mg2+ concentrations between bacteria and eukarya may have impacted the evolution of introns and gene expression mechanisms.


July 7, 2019

Genome sequence of Bacillus endophyticus and analysis of its companion mechanism in the Ketogulonigenium vulgare-Bacillus strain consortium.

Bacillus strains have been widely used as the companion strain of Ketogulonigenium vulgare in the process of vitamin C fermentation. Different Bacillus strains generate different effects on the growth of K. vulgare and ultimately influence the productivity. First, we identified that Bacillus endophyticus Hbe603 was an appropriate strain to cooperate with K. vulgare and the product conversion rate exceeded 90% in industrial vitamin C fermentation. Here, we report the genome sequencing of the B. endophyticus Hbe603 industrial companion strain and speculate on its possible advantage in the consortium. The circular chromosome of B. endophyticus Hbe603 has a size of 4.87 Mb with GC content of 36.64% and has the highest similarity with that of Bacillus megaterium among all the bacteria with complete genomes. By comparing the distribution of COGs with that of Bacillus thuringiensis, Bacillus cereus and B. megaterium, B. endophyticus has fewer genes related to cell envelope biogenesis and signal transduction mechanisms, and more genes related to carbohydrate transport and metabolism, energy production and conversion, as well as lipid transport and metabolism. Genome-based functional studies revealed the specific capability of B. endophyticus in sporulation, transcription regulation, environmental resistance, membrane transportation, extracellular proteins and nutrients synthesis, which would be beneficial for K. vulgare. In particular, B. endophyticus lacks the Rap-Phr signal cascade system and, in part, spore coat related proteins. In addition, it has specific pathways for vitamin B12 synthesis and sorbitol metabolism. The genome analysis of the industrial B. endophyticus will help us understand its cooperative mechanism in the K. vulgare-Bacillus strain consortium to improve the fermentation of vitamin C.


July 7, 2019

Genome sequence analysis of the naphthenic acid degrading and metal resistant bacterium Cupriavidus gilardii CR3.

Cupriavidus spp. are generally heavy metal tolerant bacteria with the ability to degrade a variety of aromatic hydrocarbon compounds, although the degradation pathways and substrate versatilities remain largely unknown. Here we studied the bacterium Cupriavidus gilardii strain CR3, which was isolated from a natural asphalt deposit, and which was shown to utilize naphthenic acids as a sole carbon source. Genome sequencing of C. gilardii CR3 was carried out to elucidate possible mechanisms for the naphthenic acid biodegradation. The genome of C. gilardii CR3 was composed of two circular chromosomes, chr1 and chr2, of 3,539,530 bp and 2,039,213 bp, respectively. The genome for strain CR3 encoded 4,502 putative protein-coding genes, 59 tRNA genes, and many other non-coding genes. Many genes were associated with xenobiotic biodegradation and metal resistance functions. Pathway prediction for degradation of cyclohexanecarboxylic acid, a representative naphthenic acid, suggested that naphthenic acid undergoes initial ring-cleavage, after which the ring fission products can be degraded via several plausible degradation pathways including a mechanism similar to that used for fatty acid oxidation. The final metabolic products of these pathways are unstable or volatile compounds that were not toxic to CR3. Strain CR3 was also shown to have tolerance to at least 10 heavy metals, which was mainly achieved by self-detoxification through ion efflux, metal-complexation and metal-reduction, and a powerful DNA self-repair mechanism. Our genomic analysis suggests that CR3 is well adapted to survive the harsh environment in natural asphalts containing naphthenic acids and high concentrations of heavy metals.


July 7, 2019

Complex population structure and virulence differences among serotype 2 Streptococcus suis strains belonging to sequence type 28.

Streptococcus suis is a major swine pathogen and a zoonotic agent. Serotype 2 strains are the most frequently associated with disease. However, not all serotype 2 lineages are considered virulent. Indeed, sequence type (ST) 28 serotype 2 S. suis strains have been described as a homogeneous group of low virulence. However, ST28 strains are often isolated from diseased swine in some countries, and at least four human ST28 cases have been reported. Here, we used whole-genome sequencing and animal infection models to test the hypothesis that the ST28 lineage comprises strains of different genetic backgrounds and different virulence. We used 50 S. suis ST28 strains isolated in Canada, the United States and Japan from diseased pigs, and one ST28 strain from a human case isolated in Thailand. We report a complex population structure among the 51 ST28 strains. Diversity resulted from variable gene content, recombination events and numerous genome-wide polymorphisms not attributable to recombination. Phylogenetic analysis using core genome single-nucleotide polymorphisms revealed four discrete clades with strong geographic structure, and a fifth clade formed by US, Thai and Japanese strains. When tested in experimental animal models, strains from this latter clade were significantly more virulent than a Canadian ST28 reference strain, and a closely related Canadian strain. Our results highlight the limitations of MLST for both phylogenetic analysis and virulence prediction and raise concerns about the possible emergence of ST28 strains in human clinical cases.


July 7, 2019

Genomic epidemiology of an endoscope-associated outbreak of Klebsiella pneumoniae carbapenemase (KPC)-producing K. pneumoniae.

Increased incidence of infections due to Klebsiella pneumoniae carbapenemase (KPC)-producing Klebsiella pneumoniae (KPC-Kp) was noted among patients undergoing endoscopic retrograde cholangiopancreatography (ERCP) at a single hospital. An epidemiologic investigation identified KPC-Kp and non-KPC-producing, extended-spectrum β-lactamase (ESBL)-producing Kp in cultures from 2 endoscopes. Genotyping was performed on patient and endoscope isolates to characterize the microbial genomics of the outbreak. Genetic similarity of 51 Kp isolates from 37 patients and 3 endoscopes was assessed by pulsed-field gel electrophoresis (PFGE) and multi-locus sequence typing (MLST). Five patient and 2 endoscope isolates underwent whole genome sequencing (WGS). Two KPC-encoding plasmids were characterized by single molecule, real-time sequencing. Plasmid diversity was assessed by endonuclease digestion. Genomic and epidemiologic data were used in conjunction to investigate the outbreak source. Two clusters of Kp patient isolates were genetically related to endoscope isolates by PFGE. A subset of patient isolates were collected post-ERCP, suggesting ERCP endoscopes as a possible source. A phylogeny of 7 Kp genomes from patient and endoscope isolates supported ERCP as a potential source of transmission. Differences in gene content defined 5 ST258 subclades and identified 2 of the subclades as outbreak-associated. A novel KPC-encoding plasmid, pKp28, helped define and track one endoscope-associated ST258 subclade. WGS demonstrated high genetic relatedness of patient and ERCP endoscope isolates, suggesting ERCP-associated transmission of ST258 KPC-Kp. Gene and plasmid content discriminated the outbreak from endemic ST258 populations and assisted with the molecular epidemiologic investigation of an extended KPC-Kp outbreak.


July 7, 2019

Evaluation and validation of assembling corrected PacBio long reads for microbial genome completion via hybrid approaches.

Despite the ever-increasing output of next-generation sequencing data along with developing assemblers, dozens to hundreds of gaps still exist in de novo microbial assemblies due to uneven coverage and large genomic repeats. Third-generation single-molecule, real-time (SMRT) sequencing technology avoids amplification artifacts and generates kilobase-long reads with the potential to complete microbial genome assembly. However, due to the low accuracy (~85%) of third-generation sequences, a considerable number of long reads (>50X) are required for self-correction and for subsequent de novo assembly. Recently developed hybrid approaches, using next-generation sequencing data and as few as 5X long reads, have been proposed to improve the completeness of microbial assembly. In this study we have evaluated the contemporary hybrid approaches and demonstrated that assembling corrected long reads (by runCA) produced the best assembly compared to long-read scaffolding (e.g., AHA, Cerulean and SSPACE-LongRead) and gap-filling (SPAdes). For generating corrected long reads, we further examined long-read correction tools, such as ECTools, LSC, LoRDEC, PBcR pipeline and proovread. We have demonstrated that three microbial genomes including Escherichia coli K12 MG1655, Meiothermus ruber DSM1279 and Pedobacter heparinus DSM2366 were successfully hybrid assembled by runCA into near-perfect assemblies using ECTools-corrected long reads. In addition, we developed a tool, Patch, which takes corrected long reads and pre-assembled contigs as inputs, to enhance microbial genome assemblies. With the additional 20X long reads, short reads of S. cerevisiae W303 were hybrid assembled into 115 contigs using the verified strategy, ECTools + runCA. Patch was subsequently applied to upgrade the assembly to a 35-contig draft genome.
Our evaluation of the hybrid approaches shows that assembling the ECTools-corrected long reads via runCA generates near complete microbial genomes, suggesting that genome assembly could benefit from re-analyzing the available hybrid datasets that were not assembled in an optimal fashion.


July 7, 2019

The genome of the anaerobic fungus Orpinomyces sp. strain C1A reveals the unique evolutionary history of a remarkable plant biomass degrader.

Anaerobic gut fungi represent a distinct early-branching fungal phylum (Neocallimastigomycota) and reside in the rumen, hindgut, and feces of ruminant and nonruminant herbivores. The genome of an anaerobic fungal isolate, Orpinomyces sp. strain C1A, was sequenced using a combination of Illumina and PacBio single-molecule real-time (SMRT) technologies. The large genome (100.95 Mb, 16,347 genes) displayed extremely low G+C content (17.0%), large noncoding intergenic regions (73.1%), proliferation of microsatellite repeats (4.9%), and multiple gene duplications. Comparative genomic analysis identified multiple genes and pathways that are absent in Dikarya genomes but present in early-branching fungal lineages and/or nonfungal Opisthokonta. These included genes for posttranslational fucosylation, the production of specific intramembrane proteases and extracellular protease inhibitors, the formation of a complete axoneme and intraflagellar trafficking machinery, and a near-complete focal adhesion machinery. Analysis of the lignocellulolytic machinery in the C1A genome revealed an extremely rich repertoire, with evidence of horizontal gene acquisition from multiple bacterial lineages. Experimental analysis indicated that strain C1A is a remarkable biomass degrader, capable of simultaneous saccharification and fermentation of the cellulosic and hemicellulosic fractions in multiple untreated grasses and crop residues examined, with the process significantly enhanced by mild pretreatments. This capability, acquired during its separate evolutionary trajectory in the rumen, along with its resilience and invasiveness compared to prokaryotic anaerobes, renders anaerobic fungi promising agents for consolidated bioprocessing schemes in biofuels production.


July 7, 2019

Neolithic mitochondrial haplogroup H genomes and the genetic origins of Europeans.

Haplogroup H dominates present-day Western European mitochondrial DNA variability (>40%), yet was less common (~19%) among Early Neolithic farmers (~5450 BC) and virtually absent in Mesolithic hunter-gatherers. Here we investigate this major component of the maternal population history of modern Europeans and sequence 39 complete haplogroup H mitochondrial genomes from ancient human remains. We then compare this ‘real-time’ genetic data with cultural changes taking place between the Early Neolithic (~5450 BC) and Bronze Age (~2200 BC) in Central Europe. Our results reveal that the current diversity and distribution of haplogroup H were largely established by the Mid Neolithic (~4000 BC), but with substantial genetic contributions from subsequent pan-European cultures such as the Bell Beakers expanding out of Iberia in the Late Neolithic (~2800 BC). Dated haplogroup H genomes allow us to reconstruct the recent evolutionary history of haplogroup H and reveal a mutation rate 45% higher than current estimates for human mitochondria.

