July 7, 2019

Feasibility of real time next generation sequencing of cancer genes linked to drug response: results from a clinical trial.

The successes of targeted drugs with companion predictive biomarkers and the technological advances in gene sequencing have generated enthusiasm for evaluating personalized cancer medicine strategies using genomic profiling. We assessed the feasibility of incorporating real-time analysis of somatic mutations within exons of 19 genes into patient management. Blood, tumor biopsy and archived tumor samples were collected from 50 patients recruited from four cancer centers. Samples were analyzed using three technologies: targeted exon sequencing using Pacific Biosciences PacBio RS, multiplex somatic mutation genotyping using Sequenom MassARRAY and Sanger sequencing. An expert panel reviewed results prior to reporting to clinicians. A clinical laboratory verified actionable mutations. Fifty patients were recruited. Nineteen actionable mutations were identified in 16 (32%) patients. Across technologies, results were in agreement in 100% of biopsy specimens and 95% of archival specimens. Profiling results from paired archival/biopsy specimens were concordant in 30/34 (88%) patients. We demonstrated that the use of next generation sequencing for real-time genomic profiling in advanced cancer patients is feasible. Additionally, actionable mutations identified in this study were relatively stable between archival and biopsy samples, implying that cancer mutations that are good predictors of drug response may remain constant across clinical stages. Copyright © 2012 UICC.


July 7, 2019

Haplotype assembly in polyploid genomes and identical by descent shared tracts.

Genome-wide haplotype reconstruction from sequence data, or haplotype assembly, is at the center of major challenges in molecular biology and life sciences. For complex eukaryotic organisms like humans, the genome is vast and the population samples are growing so rapidly that algorithms processing high-throughput sequencing data must scale favorably in terms of both accuracy and computational efficiency. Furthermore, current models and methodologies for haplotype assembly (i) do not consider individuals sharing haplotypes jointly, which reduces the size and accuracy of assembled haplotypes, and (ii) are unable to model genomes having more than two sets of homologous chromosomes (polyploidy). Polyploid organisms are increasingly becoming the target of many research groups interested in the genomics of disease, phylogenetics, botany and evolution, but there is an absence of theory and methods for polyploid haplotype reconstruction. In this work, we present a number of results, extensions and generalizations of compass graphs and our HapCompass framework. We prove the theoretical complexity of two haplotype assembly optimizations, thereby motivating the use of heuristics. Furthermore, we present graph theory-based algorithms for the problem of haplotype assembly using our previously developed HapCompass framework for (i) novel implementations of haplotype assembly optimizations (minimum error correction), (ii) assembly of a pair of individuals sharing a haplotype tract identical by descent and (iii) assembly of polyploid genomes. We evaluate our methods on 1000 Genomes Project, Pacific Biosciences and simulated sequence data. HapCompass is available for download at http://www.brown.edu/Research/Istrail_Lab/. Supplementary data are available at Bioinformatics online.
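The minimum error correction (MEC) objective named in this abstract can be illustrated with a small sketch. This is not HapCompass code. It only scores one candidate diploid haplotype against a set of read fragments, assuming biallelic SNPs coded 0/1. The names `Fragment`, `mismatches` and `mec_score`, and the toy data, are hypothetical and chosen for illustration.

```python
from typing import Dict, List

# Hypothetical representation: a fragment maps SNP index -> observed allele (0 or 1).
Fragment = Dict[int, int]


def mismatches(fragment: Fragment, haplotype: List[int]) -> int:
    """Count the sites where the fragment disagrees with the haplotype."""
    return sum(1 for site, allele in fragment.items() if haplotype[site] != allele)


def mec_score(fragments: List[Fragment], haplotype: List[int]) -> int:
    """MEC score of a diploid phasing given by a haplotype and its complement.

    Each fragment is assigned to whichever of the two haplotypes it matches
    best; the score is the total number of allele calls that would have to be
    corrected to make every fragment consistent with its assigned haplotype.
    """
    complement = [1 - allele for allele in haplotype]
    return sum(
        min(mismatches(f, haplotype), mismatches(f, complement)) for f in fragments
    )


# Toy data (assumed, not from the paper): three fragments over four SNP sites.
fragments = [
    {0: 0, 1: 0, 2: 1},
    {1: 1, 2: 0, 3: 1},
    {0: 0, 1: 0, 3: 1},
]
haplotype = [0, 0, 1, 0]
score = mec_score(fragments, haplotype)
```

Finding the haplotype that minimizes this score is the optimization problem; the sketch only evaluates a given candidate, which is the inner step a heuristic search would repeat.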


July 7, 2019

Combining de novo and reference-guided assembly with scaffold_builder.

Genome sequencing has become routine; however, genome assembly remains a challenge despite the computational advances of the last decade. In particular, the abundance of repeat elements in genomes makes it difficult to assemble them into a single complete sequence. Identical repeats shorter than the average read length can generally be assembled without issue. However, longer repeats such as ribosomal RNA operons cannot be accurately assembled using existing tools. The application Scaffold_builder was designed to generate scaffolds – super contigs of sequences joined by N-bases – based on the similarity to a closely related reference sequence. This is independent of mate-pair information and can be used complementarily for genome assembly, e.g. when mate-pairs are not available or have already been exploited. Scaffold_builder was evaluated using simulated pyrosequencing reads of the bacterial genomes Escherichia coli 042, Lactobacillus salivarius UCC118 and Salmonella enterica subsp. enterica serovar Typhi str. P-stx-12. Moreover, we sequenced two genomes from Salmonella enterica serovar Typhimurium LT2 G455 and Salmonella enterica serovar Typhimurium SDT1291 and show that Scaffold_builder decreases the number of contig sequences by 53% while more than doubling their average length. Scaffold_builder is written in Python and is available at http://edwards.sdsu.edu/scaffold_builder. A web-based implementation is additionally provided to allow users to submit a reference genome and a set of contigs to be scaffolded.
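The idea of a scaffold as contigs joined by N-bases can be sketched in a few lines. This is not Scaffold_builder's code. It assumes each contig has already been mapped to the reference and oriented, so the input is just a start coordinate plus sequence. The function name `build_scaffold`, the `min_gap` parameter, and the treatment of overlapping contigs (a minimal N spacer instead of merging) are simplifying assumptions.

```python
from typing import List, Tuple

# Hypothetical input: (0-based start position on the reference, contig sequence).
Placement = Tuple[int, str]


def build_scaffold(placements: List[Placement], min_gap: int = 1) -> str:
    """Order contigs by reference position and join them with runs of N.

    The gap length is the distance on the reference between the end of one
    contig and the start of the next. Touching or overlapping contigs are
    NOT merged here; they are separated by `min_gap` N-bases (a simplification).
    """
    ordered = sorted(placements, key=lambda p: p[0])
    parts: List[str] = []
    prev_end = None
    for start, seq in ordered:
        if prev_end is not None:
            gap = max(start - prev_end, min_gap)
            parts.append("N" * gap)
        parts.append(seq)
        prev_end = start + len(seq)
    return "".join(parts)


# Toy placements (assumed): three short contigs given out of order.
scaffold = build_scaffold([(10, "GGCC"), (0, "ACGT"), (6, "TT")])
```

In this toy case three contigs collapse into one scaffold, which is the effect the abstract reports as fewer, longer sequences.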


July 7, 2019

Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species.

The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.


July 7, 2019

Complete genome sequence of the Mesoplasma florum W37 strain.

Mesoplasma florum is a small-genome fast-growing mollicute that is an attractive model for systems and synthetic genomics studies. We report the complete 825,824-bp genome sequence of a second representative of this species, M. florum strain W37, which contains 733 predicted open reading frames and 35 stable RNAs.


July 7, 2019

Enhanced 5-methylcytosine detection in single-molecule, real-time sequencing via Tet1 oxidation.

DNA methylation serves as an important epigenetic mark in both eukaryotic and prokaryotic organisms. In eukaryotes, the most common epigenetic mark is 5-methylcytosine, whereas prokaryotes can have 6-methyladenine, 4-methylcytosine, or 5-methylcytosine. Single-molecule, real-time sequencing is capable of directly detecting all three types of modified bases. However, the kinetic signature of 5-methylcytosine is subtle, which presents a challenge for detection. We investigated whether conversion of 5-methylcytosine to 5-carboxylcytosine using the enzyme Tet1 would enhance the kinetic signature, thereby improving detection. We characterized the kinetic signatures of various cytosine modifications, demonstrating that 5-carboxylcytosine has a larger impact on the local polymerase rate than 5-methylcytosine. Using Tet1-mediated conversion, we show improved detection of 5-methylcytosine using in vitro methylated templates and apply the method to the characterization of 5-methylcytosine sites in the genomes of Escherichia coli MG1655 and Bacillus halodurans C-125. We have developed a method for the enhancement of directly detecting 5-methylcytosine during single-molecule, real-time sequencing. Using Tet1 to convert 5-methylcytosine to 5-carboxylcytosine improves the detection rate of this important epigenetic marker, thereby complementing the set of readily detectable microbial base modifications, and enhancing the ability to interrogate eukaryotic epigenetic markers.


July 7, 2019

Complete genome sequence of a multidrug-resistant Salmonella enterica serovar Typhimurium var. 5- strain isolated from chicken breast.

Salmonella enterica subsp. enterica serovar Typhimurium is a leading cause of salmonellosis. Here, we report a closed genome sequence, including sequences of 3 plasmids, of Salmonella serovar Typhimurium var. 5- CFSAN001921 (National Antimicrobial Resistance Monitoring System [NARMS] strain ID N30688), which was isolated from chicken breast meat and shows resistance to 10 different antimicrobials. Whole-genome and plasmid sequence analyses of this isolate will help enhance our understanding of this pathogenic multidrug-resistant serovar.


July 7, 2019

Genome sequence of Phaeobacter daeponensis type strain (DSM 23529(T)), a facultatively anaerobic bacterium isolated from marine sediment, and emendation of Phaeobacter daeponensis.

TF-218(T) is the type strain of the species Phaeobacter daeponensis Yoon et al. 2007, a facultatively anaerobic Phaeobacter species isolated from tidal flats. Here we describe the draft genome sequence and annotation of this bacterium together with previously unreported aspects of its phenotype. We analyzed the genome for genes involved in secondary metabolite production and its anaerobic lifestyle, which have also been described for its closest relative Phaeobacter caeruleus. The 4,642,596 bp long genome of strain TF-218(T) contains 4,310 protein-coding genes and 78 RNA genes including four rRNA operons and consists of five replicons: one chromosome and four extrachromosomal elements with sizes of 276 kb, 174 kb, 117 kb and 90 kb. Genome analysis showed that TF-218(T) possesses all of the genes for indigoidine biosynthesis, and on specific media the strain showed a blue pigmentation. We also found genes for dissimilatory nitrate reduction, gene-transfer agents, NRPS/ PKS genes and signaling systems homologous to the LuxR/I system.


July 7, 2019

Genome of an arbuscular mycorrhizal fungus provides insight into the oldest plant symbiosis.

The mutualistic symbiosis involving Glomeromycota, a distinctive phylum of early diverging Fungi, is widely hypothesized to have promoted the evolution of land plants during the middle Paleozoic. These arbuscular mycorrhizal fungi (AMF) perform vital functions in the phosphorus cycle that are fundamental to sustainable crop plant productivity. The unusual biological features of AMF have long fascinated evolutionary biologists. The coenocytic hyphae host a community of hundreds of nuclei and reproduce clonally through large multinucleated spores. It has been suggested that the AMF maintain a stable assemblage of several different genomes during the life cycle, but this genomic organization has been questioned. Here we introduce the 153-Mb haploid genome of Rhizophagus irregularis and its repertoire of 28,232 genes. The observed low level of genome polymorphism (0.43 SNP per kb) is not consistent with the occurrence of multiple, highly diverged genomes. The expansion of mating-related genes suggests the existence of cryptic sex-related processes. A comparison of gene categories confirms that R. irregularis is close to the Mucoromycotina. The AMF obligate biotrophy is not explained by genome erosion or any related loss of metabolic complexity in central metabolism, but is marked by a lack of genes encoding plant cell wall-degrading enzymes and of genes involved in toxin and thiamine synthesis. A battery of mycorrhiza-induced secreted proteins is expressed in symbiotic tissues. The present comprehensive repertoire of R. irregularis genes provides a basis for future research on symbiosis-related mechanisms in Glomeromycota.


July 7, 2019

Precise breakpoint localization of large genomic deletions using PacBio and Illumina next-generation sequencers.

Herein we present the applicability of single-molecule (PacBio RS) and second-generation sequencing technology (Illumina) to the characterization of large genomic deletions. By testing samples previously characterized using a Sanger approach, our methods determined that both next-generation sequencing platforms were able to identify the position of deletion breakpoints. Our results point out various advantages of next-generation sequencing platforms when characterizing genomic deletions; however, special attention must be dedicated to identical sequences flanking the breakpoints, such as poly(N) motifs.


July 7, 2019

Complete genome sequence of Staphylococcus aureus Z172, a vancomycin-intermediate and daptomycin-nonsusceptible methicillin-resistant strain isolated in Taiwan.

We report the complete genome sequence of Z172, a representative strain of sequence type 239-staphylococcal cassette chromosome mec type III (ST239-SCCmec type III) hospital-associated methicillin-resistant Staphylococcus aureus in Taiwan. Strain Z172 also exhibits a vancomycin-intermediate and daptomycin-nonsusceptible phenotype.


July 7, 2019

PBSIM: PacBio reads simulator–toward accurate genome assembly.

PacBio sequencers produce two types of characteristic reads (continuous long reads: long and high error rate; and circular consensus sequencing: short and low error rate), both of which could be useful for de novo assembly of genomes. Currently, there is no available simulator that targets the specific generation of PacBio libraries. Our analysis of 13 PacBio datasets showed characteristic features of PacBio reads (e.g. the read length of PacBio reads follows a log-normal distribution). We have developed a read simulator, PBSIM, that captures these features using either a model-based or sampling-based method. Using PBSIM, we conducted several hybrid error correction and assembly tests for PacBio reads, suggesting that a continuous long reads coverage depth of at least 15 in combination with a circular consensus sequencing coverage depth of at least 30 achieved extensive assembly results. PBSIM is freely available from the web under the GNU GPL v2 license (http://code.google.com/p/pbsim/).
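The log-normal read-length observation lends itself to a short model-based sketch. This is not PBSIM's implementation. It only draws read lengths from a log-normal distribution until a target coverage depth is reached. The function names, the length bounds, and the example mean and standard deviation (3000 and 2300 bases) are assumptions for illustration, not values taken from the paper.

```python
import math
import random
from typing import List, Tuple


def lognormal_params(mean: float, sd: float) -> Tuple[float, float]:
    """Convert the desired mean and sd of read lengths into the (mu, sigma)
    parameters of the underlying normal distribution."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - sigma2 / 2.0, math.sqrt(sigma2)


def sample_read_lengths(
    genome_size: int,
    depth: float,
    mean: float,
    sd: float,
    min_len: int = 100,      # assumed lower bound on accepted read length
    max_len: int = 25000,    # assumed upper bound on accepted read length
    seed: int = 0,
) -> List[int]:
    """Draw log-normal read lengths until total bases >= depth * genome_size.

    Draws outside [min_len, max_len] are rejected and redrawn, so the accepted
    lengths follow a truncated log-normal distribution.
    """
    rng = random.Random(seed)
    mu, sigma = lognormal_params(mean, sd)
    target = depth * genome_size
    lengths: List[int] = []
    total = 0
    while total < target:
        length = int(rng.lognormvariate(mu, sigma))
        if min_len <= length <= max_len:
            lengths.append(length)
            total += length
    return lengths


# Example: continuous-long-read lengths for depth 15 over a 100 kb toy genome.
lengths = sample_read_lengths(genome_size=100_000, depth=15, mean=3000.0, sd=2300.0)
```

A full simulator would also place each read on the genome and introduce errors; the sketch covers only the length model and the coverage stopping rule.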

