June 1, 2021

Highly contiguous de novo human genome assembly and long-range haplotype phasing using SMRT Sequencing

The long reads, random error, and unbiased sampling of SMRT Sequencing enable high-quality, de novo assembly of the human genome. PacBio long reads are capable of resolving genomic variations at all size scales, including SNPs, insertions, deletions, inversions, translocations, and repeat expansions, all of which are both important in understanding the genetic basis for human disease and difficult to access via other technologies. In demonstration of this, we report a new high-quality, diploid-aware de novo assembly of Craig Venter’s well-studied genome.


June 1, 2021

Multiplexing strategies for microbial whole genome SMRT Sequencing

As the throughput of the PacBio Systems continues to increase, so has the desire to fully utilize SMRT Cell sequencing capacity to multiplex microbes for whole genome sequencing. Multiplexing is readily achieved by incorporating a unique barcode for each microbe into the SMRTbell adapters and using a streamlined library preparation process. Incorporating barcodes without PCR amplification prevents the loss of epigenetic information and the generation of chimeric sequences, while eliminating the need to generate separate SMRTbell libraries. We multiplexed the genomes of up to 8 unique strains of H. pylori. Each genome was sheared and processed through adapter ligation in a single, addition-only reaction. The barcoded samples were pooled in equimolar quantities and a single SMRTbell library was prepared. We demonstrate successful de novo microbial assembly from all multiplexes tested (2- through 8-plex) using data generated from a single SMRTbell library, run on a single SMRT Cell with the PacBio RS II, and analyzed with standard SMRT Analysis assembly methods. This strategy was successful using both small (1.6 Mb, H. pylori) and medium (5 Mb, E. coli) genomes. This protocol facilitates the sequencing of multiple microbial genomes in a single run, greatly increasing throughput and reducing costs per genome.
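The equimolar pooling step mentioned above can be illustrated with a short calculation. This is a minimal sketch, not the protocol's own procedure: it assumes the common approximation of about 660 g/mol per double-stranded base pair, ignores adapter mass, and uses hypothetical sample names, concentrations, and fragment lengths.

```python
# Assumed average molar mass of one double-stranded DNA base pair (g/mol).
# This is a widely used approximation, not a value taken from the abstract.
BP_MOLAR_MASS = 660.0


def ng_per_ul_to_nM(conc_ng_per_ul, mean_length_bp):
    """Convert a dsDNA mass concentration (ng/uL) to a molar concentration (nM)."""
    return conc_ng_per_ul * 1e6 / (BP_MOLAR_MASS * mean_length_bp)


def equimolar_volumes(samples, fmol_each):
    """Return the volume (uL) of each sample that contains fmol_each femtomoles.

    samples maps a sample name to (concentration in ng/uL, mean fragment length in bp).
    """
    volumes = {}
    for name, (conc, length) in samples.items():
        nanomolar = ng_per_ul_to_nM(conc, length)  # 1 nM equals 1 fmol/uL
        volumes[name] = fmol_each / nanomolar
    return volumes


# Hypothetical barcoded samples: (ng/uL, mean sheared fragment length in bp).
samples = {
    "barcode_01": (20.0, 10_000),
    "barcode_02": (10.0, 10_000),
}
volumes = equimolar_volumes(samples, fmol_each=30.0)
for name, volume in volumes.items():
    print(f"{name}: pool {volume:.1f} uL")
```

With equal fragment lengths, a sample at half the mass concentration contributes twice the volume, so each barcode is represented by the same number of molecules in the pooled library.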


June 1, 2021

Complete telomere-to-telomere de novo assembly of the Plasmodium falciparum genome using long-read sequencing

Sequence-based estimation of genetic diversity of Plasmodium falciparum, the most lethal malarial parasite, has proved challenging due to the lack of a complete genomic assembly. The skewed AT-richness (~80.6% (A+T)) of its genome and the lack of technology to assemble highly polymorphic sub-telomeric regions that contain clonally variant, multigene virulence families (i.e. var and rifin) have confounded attempts using short-read NGS technologies. Using single molecule, real-time (SMRT) sequencing, we successfully compiled all 14 nuclear chromosomes of the P. falciparum genome from telomere-to-telomere in single contigs. Specifically, amplification-free sequencing generated reads of average length 12 kb, with ≥50% of the reads between 15.5 and 50 kb in length. A hierarchical genome assembly process (HGAP) was used to assemble the P. falciparum genome de novo. This assembly accurately resolved centromeres (~90-99% (A+T)) and sub-telomeric regions, and identified large insertions and duplications in the genome that added extra genes to the var and rifin virulence families, along with smaller structural variants such as homopolymer tract expansions. These regions can be used as markers for genetic diversity during comparative genome analyses. Moreover, identifying the polymorphic and repetitive sub-telomeric sequences of parasite populations from endemic areas might inform the link between structural variation and phenotypes such as virulence, drug resistance and disease transmission.


June 1, 2021

Multiplexed complete microbial genomes on the Sequel System

Microbes play an important role in nearly every part of our world, as they affect human health, our environment, and agriculture, and aid in waste management. Complete closed genome sequences, which have become the gold standard with PacBio long-read sequencing, can be key to understanding microbial functional characteristics. However, input requirements, consumables costs, and the labor required to prepare and sequence a microbial genome have in the past put PacBio sequencing out of reach for some larger projects. We have developed a multiplexed library prep approach that is simple, fast, and cost-effective, and can produce 4 to 16 closed bacterial genomes from one Sequel SMRT Cell. Additionally, we are introducing a streamlined analysis pipeline for processing multiplexed genome sequence data through de novo HGAP assembly, making the entire process easy for lab personnel to perform. Here we present the entire workflow from shearing through assembly, with times for each step. We show HGAP assembly results with single or very few contigs from bacteria with genomes of different sizes, sequenced with or without size selection. These data illustrate the benefits and potential of the PacBio multiplexed library prep and the Sequel System for sequencing large numbers of microbial genomes.


June 1, 2021

Best practices for whole genome sequencing using the Sequel System

Plant and animal whole genome sequencing has proven to be challenging, particularly due to genome size, high density of repetitive elements and heterozygosity. The Sequel System delivers long reads, high consensus accuracy and uniform coverage, enabling more complete, accurate, and contiguous assemblies of these large complex genomes. The latest Sequel chemistry increases yield up to 8 Gb per SMRT Cell for long insert libraries >20 kb and up to 10 Gb per SMRT Cell for libraries >40 kb. In addition, the recently released SMRTbell Express Template Prep Kit reduces the time (~3 hours) and DNA input (~3 µg), making the workflow easy to use for multi-SMRT Cell projects. Here, we recommend the best practices for whole genome sequencing and de novo assembly of complex plant and animal genomes. Guidelines for constructing large-insert SMRTbell libraries (>30 kb) to generate optimal read lengths and yields using the latest Sequel chemistry are presented. We also describe ways to maximize library yield per preparation from as little as 3 µg of sheared genomic DNA. The combination of these advances makes plant and animal whole genome sequencing a practical application of the Sequel System.


April 21, 2020

Complete genome sequence of Marinobacter sp. LQ44, a haloalkaliphilic phenol-degrading bacterium isolated from a deep-sea hydrothermal vent

Marinobacter sp. strain LQ44, an alkaliphile and moderate halophile from a deep-sea hydrothermal vent on the East Pacific Rise, is a novel phenol-degrading bacterium that is capable of utilizing phenol as its sole carbon and energy source. Here, we present the complete genome sequence of strain LQ44, which consists of 4,435,564 bp with a circular chromosome, 4164 protein-coding genes, 3 rRNA operons and 50 tRNAs. Genome analysis revealed that strain LQ44 may degrade phenol via the meta-cleavage pathway. The LQ44 genome contains multiple genes involved in pH adaptation and osmotic adjustment. Genes related to hydrocarbon degradation, aerobic denitrification and potentially industrially important enzymes were also identified from the genome. To our knowledge, this is the first report of a genome sequence of a haloalkaliphilic phenol-degrading bacterium, which will provide insights into the survival of this bacterium under salt-alkali conditions and the potential for biotechnological applications.


April 21, 2020

ASA3P: An automatic and scalable pipeline for the assembly, annotation and higher level analysis of closely related bacterial isolates

Whole genome sequencing of bacteria has become a daily routine in many fields. Advances in DNA sequencing technologies and continuously dropping costs have resulted in a tremendous increase in the amounts of available sequence data. However, comprehensive in-depth analysis of the resulting data remains an arduous and time-consuming task. In order to keep pace with these promising but challenging developments and to transform raw data into valuable information, standardized analyses and scalable software tools are needed. Here, we introduce ASA3P, a fully automatic, locally executable and scalable assembly, annotation and analysis pipeline for bacterial genomes. The pipeline automatically executes necessary data processing steps, i.e. quality clipping and assembly of raw sequencing reads, scaffolding of contigs and annotation of the resulting genome sequences. Furthermore, ASA3P conducts comprehensive genome characterizations and analyses, e.g. taxonomic classification, detection of antibiotic resistance genes and identification of virulence factors. All results are presented via an HTML5 user interface providing aggregated information, interactive visualizations and access to intermediate results in standard bioinformatics file formats. We distribute ASA3P in two versions: a locally executable Docker container for small-to-medium-scale projects and an OpenStack based cloud computing version able to automatically create and manage self-scaling compute clusters. Thus, automatic and standardized analysis of hundreds of bacterial genomes becomes feasible within hours. The software and further information are available at: http://asap.computational.bio.


April 21, 2020

Cupriavidus sp. strain Ni-2 resistant to high concentration of nickel and its genes responsible for the tolerance by genome comparison.

The widespread use of metals has prompted many researchers to examine the relationship between heavy metal toxicity and bacterial resistance. In this study, we inoculated heavy metal-contaminated soil from the Janghang region of South Korea into nickel-containing media (20 mM Ni2+) for enrichment. Among dozens of colonies acquired from several transfers and serial dilutions at the same concentration of Ni, the strain Ni-2 was chosen for further studies. The isolates were identified for their phylogenetic affiliations using 16S rRNA gene analysis. The strain Ni-2 was close to Cupriavidus metallidurans and was found by the disk diffusion method to be resistant to the antibiotics vancomycin, erythromycin, chloramphenicol, ampicillin, gentamicin, streptomycin, and kanamycin. Of the isolated strains, Ni-2 was selected for whole genome sequencing, since its Ni resistance appeared to be better than that of the other strains. From the genome sequence we found a total of 89 metal-resistance-related genes, including 11 Ni-resistance genes, 41 heavy metal (As, Cd, Zn, Hg, Cu, and Co)-resistance genes, 22 cation-efflux genes, 4 metal pumping ATPase genes, and 11 metal transporter genes.


April 21, 2020

Complete genome sequence of Paracoccus sp. Arc7-R13, a silver nanoparticles synthesizing bacterium isolated from Arctic Ocean sediments

Paracoccus sp. Arc7-R13, a silver nanoparticle (AgNP)-synthesizing bacterium, was isolated from Arctic Ocean sediment. Here we describe the complete genome of Paracoccus sp. Arc7-R13. The complete genome contains 4,040,012 bp with 66.66 mol% G+C content, including one circular chromosome of 3,231,929 bp (67.45 mol% G+C content) and eight plasmids with lengths ranging from 24,536 bp to 199,685 bp. The genome contains 3835 protein-coding genes (CDSs), 49 tRNA genes, and 3 rRNA operons organized as 16S-23S-5S rRNA. Based on the gene annotation and Swiss-Prot analysis, a total of 15 genes belonging to 11 kinds, including silver exporting P-type ATPase (SilP), alkaline phosphatase, nitroreductase, thioredoxin reductase, NADPH dehydrogenase and glutathione peroxidase, might be related to the synthesis of AgNPs. Meanwhile, many additional genes associated with synthesis of AgNPs such as protein-disulfide isomerase, c-type cytochrome, glutathione synthase and dehydrogenase reductase were also identified.


April 21, 2020

Comparative Genomic Analysis of Virulence, Antimicrobial Resistance, and Plasmid Profiles of Salmonella Dublin Isolated from Sick Cattle, Retail Beef, and Humans in the United States.

Salmonella enterica serovar Dublin is a host-adapted serotype associated with typhoidal disease in cattle. While rare in humans, it usually causes severe illness, including bacteremia. In the United States, Salmonella Dublin has become one of the most multidrug-resistant (MDR) serotypes. To understand the genetic elements that are associated with virulence and resistance, we sequenced 61 isolates of Salmonella Dublin (49 from sick cattle and 12 from retail beef) using the Illumina MiSeq and closed 5 genomes using the PacBio sequencing platform. Genomic data of eight human isolates were also downloaded from NCBI (National Center for Biotechnology Information) for comparative analysis. Fifteen Salmonella pathogenicity islands (SPIs) and a spv operon (spvRABCD), which encodes important virulence factors, were identified in all 69 (100%) isolates. The 15 SPIs were located on the chromosome of the 5 closed genomes, with each of these isolates also carrying 1 or 2 plasmids with sizes between 36 and 329 kb. Multiple antimicrobial resistance genes (ARGs), including blaCMY-2, blaTEM-1B, aadA12, aph(3′)-Ia, aph(3′)-Ic, strA, strB, floR, sul1, sul2, and tet(A), along with spv operons were identified on these plasmids. Comprehensive antimicrobial resistance genotypes were determined, including 17 genes encoding resistance to 5 different classes of antimicrobials, and mutations in the housekeeping gene (gyrA) associated with resistance or decreased susceptibility to fluoroquinolones. Together these data revealed that this panel of Salmonella Dublin commonly carried 15 SPIs, MDR/virulence plasmids, and ARGs against several classes of antimicrobials. Such genomic elements may make important contributions to the severity of disease and treatment failures in Salmonella Dublin infections in both humans and cattle.


April 21, 2020

Phylogenetic reconciliation reveals the natural history of glycopeptide antibiotic biosynthesis and resistance.

Glycopeptide antibiotics are produced by Actinobacteria through biosynthetic gene clusters that include genes supporting their regulation, synthesis, export and resistance. The chemical and biosynthetic diversities of glycopeptides are the product of an intricate evolutionary history. Extracting this history from genome sequences is difficult as conservation of the individual components of these gene clusters is variable and each component can have a different trajectory. We show that glycopeptide biosynthesis and resistance in Actinobacteria maps to approximately 150-400 million years ago. Phylogenetic reconciliation reveals that the precursors of glycopeptide biosynthesis are far older than other components, implying that these clusters arose from a pre-existing pool of genes. We find that resistance appeared contemporaneously with biosynthetic genes, raising the possibility that the mechanism of action of glycopeptides was a driver of diversification in these gene clusters. Our results put antibiotic biosynthesis and resistance into an evolutionary context and can guide the future discovery of compounds possessing new mechanisms of action, which are especially needed as the usefulness of the antibiotics available at present is imperilled by human activity.


April 21, 2020

A putative microcin amplifies Shiga toxin 2a production of Escherichia coli O157:H7

Escherichia coli O157:H7 is a foodborne pathogen, implicated in various multi-state outbreaks. It encodes Shiga toxin on a prophage, and Shiga toxin production is linked to phage induction. An E. coli strain, designated 0.1229, was identified that amplified Stx2a production when co-cultured with E. coli O157:H7 strain PA2. Growth of PA2 in 0.1229 cell-free supernatants had a similar effect, even when supernatants were heated to 100°C for 10 min, but not after treatment with Proteinase K. The secreted molecule was shown to use TolC for export and the TonB system for import. The genes sufficient for production of this molecule were localized to a 5.2 kb region of a 12.8 kb plasmid. This region was annotated, identifying hypothetical proteins, a predicted ABC transporter, and a cupin superfamily protein. These genes were identified and shown to be functional in two other E. coli strains, and bioinformatic analyses identified related gene clusters in similar and distinct bacterial species. These data collectively suggest E. coli 0.1229 and other E. coli produce a microcin that induces the SOS response in target bacteria. Besides adding to the limited number of microcins known to be produced by E. coli, this study provides an additional mechanism by which stx2a expression is increased in response to the gut microflora.


April 21, 2020

A megaplasmid family responsible for dissemination of multidrug resistance in Pseudomonas

Multidrug resistance (MDR) represents a global threat to health. Although plasmids can play an important role in the dissemination of MDR, they have not been commonly linked to the emergence of antimicrobial resistance in the pathogen Pseudomonas aeruginosa. We used whole genome sequencing to characterize a collection of P. aeruginosa clinical isolates from a hospital in Thailand. Using long-read sequence data we obtained complete sequences of two closely related megaplasmids (>420 kb) carrying large arrays of antibiotic resistance genes located in discrete, complex and dynamic resistance regions, and revealing evidence of extensive duplication and recombination events. A comprehensive pangenomic and phylogenomic analysis indicated that 1) these large plasmids comprise a family present in different members of the Pseudomonas genus and associated with multiple sources (geographical, clinical or environmental); 2) the megaplasmids encode diverse niche-adaptive accessory traits, including multidrug resistance; 3) the pangenome of the megaplasmid family is highly flexible and diverse, comprising a substantial core genome (average of 48% of plasmid genes), but with individual members carrying large numbers of unique genes. The history of the megaplasmid family, inferred from our analysis of the available database, suggests that members carrying multiple resistance genes date back to at least the 1970s.


April 21, 2020

Convergent evolution of linked mating-type loci in basidiomycetes: an ancient fusion event that has stood the test of time

Sexual development is a key evolutionary innovation of eukaryotes. In many species, mating involves interaction between compatible mating partners that can undergo cell and nuclear fusion and subsequent steps of development including meiosis. Mating compatibility in fungi is governed by mating type determinants, which are localized at mating type (MAT) loci. In basidiomycetes, the ancestral state is hypothesized to be tetrapolar (bifactorial), with two genetically unlinked MAT loci containing homeodomain transcription factor genes (HD locus) and pheromone and pheromone receptor genes (P/R locus), respectively. Alleles at both loci must differ between mating partners for completion of sexual development. However, there are also basidiomycete species with bipolar (unifactorial) mating systems, which can arise through genomic linkage of the HD and P/R loci. In the order Tremellales, which is comprised of mostly yeast-like species, bipolarity is found only in the human pathogenic Cryptococcus species. Here, we describe the analysis of MAT loci from the Trichosporonales, a sister order to the Tremellales. We analyzed genome sequences from 29 strains that belong to 24 species, including two new genome sequences generated in this study. Interestingly, in all of the species analyzed, the MAT loci are fused and a single HD gene is present in each mating type. This is similar to the organization in the pathogenic Cryptococci, which also have linked MAT loci and carry only one HD gene per MAT locus instead of the usual two HD genes found in the vast majority of basidiomycetes. However, the HD and P/R allele combinations in the Trichosporonales are different from those in the pathogenic Cryptococcus species. 
The differences in allele combinations compared to the bipolar Cryptococci as well as the existence of tetrapolar Tremellales sister species suggest that fusion of the HD and P/R loci and differential loss of one of the two HD genes per MAT allele occurred independently in the Trichosporonales and pathogenic Cryptococci. This finding supports the hypothesis of convergent evolution at the molecular level towards fused mating-type regions in fungi, similar to previous findings in other fungal groups. Unlike the fused MAT loci in several other basidiomycete lineages though, the gene content and gene order within the fused MAT loci are highly conserved in the Trichosporonales, and there is no apparent suppression of recombination extending from the MAT loci to adjacent chromosomal regions, suggesting different mechanisms for the evolution of physically linked MAT loci in these groups.


April 21, 2020

Identification and characterization of OmpT-like proteases in uropathogenic Escherichia coli clinical isolates

Bacterial colonization of the urogenital tract is limited by innate defenses, including the production of antimicrobial peptides (AMPs). Uropathogenic Escherichia coli (UPEC) resist AMP-killing to cause a range of urinary tract infections (UTIs) including asymptomatic bacteriuria, cystitis, pyelonephritis, and sepsis. UPEC strains have high genomic diversity and encode numerous virulence factors that differentiate them from non-UTI causing strains, including ompT. As OmpT homologues cleave and inactivate AMPs, we hypothesized that high OmpT protease activity levels contribute to UPEC colonization during symptomatic UTIs. Therefore, we measured OmpT activity in 58 UPEC clinical isolates. While heterogeneous OmpT activities were observed, OmpT activity was significantly greater in UPEC strains isolated from patients with symptomatic infections. Unexpectedly, UPEC strains exhibiting the greatest protease activities harboured an additional ompT-like gene called arlC (ompTp). The presence of two OmpT-like proteases in some UPEC isolates led us to compare the substrate specificities of OmpT-like proteases found in E. coli. While all three cleaved AMPs, cleavage efficiency varied on the basis of AMP size and secondary structure. Our findings suggest the presence of ArlC and OmpT in the same UPEC isolate may confer a fitness advantage by expanding the range of target substrates.

