July 19, 2019  |  

De novo assembly of haplotype-resolved genomes with trio binning.

Complex allelic variation hampers the assembly of haplotype-resolved sequences from diploid genomes. We developed trio binning, an approach that simplifies haplotype assembly by resolving allelic variation before assembly. In contrast with prior approaches, the effectiveness of our method improved with increasing heterozygosity. Trio binning uses short reads from two parental genomes to first partition long reads from an offspring into haplotype-specific sets. Each haplotype is then assembled independently, resulting in a complete diploid reconstruction. We used trio binning to recover both haplotypes of a diploid human genome and identified complex structural variants missed by alternative approaches. We sequenced an F1 cross between the cattle subspecies Bos taurus taurus and Bos taurus indicus and completely assembled both parental haplotypes with NG50 haplotig sizes of >20 Mb and 99.998% accuracy, surpassing the quality of current cattle reference genomes. We suggest that trio binning improves diploid genome assembly and will facilitate new studies of haplotype variation and inheritance.
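The partitioning step described above can be illustrated with a minimal sketch. It assumes that reads are binned by counting k-mers found in only one parent; the function names, the tiny k value, and the simple majority rule are hypothetical simplifications, not the authors' implementation (which would also need to handle reverse complements, sequencing errors, and normalization).

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(maternal_reads, paternal_reads, k):
    """Return (maternal-only, paternal-only) k-mer sets from parental short reads."""
    mat = set().union(*(kmers(r, k) for r in maternal_reads))
    pat = set().union(*(kmers(r, k) for r in paternal_reads))
    return mat - pat, pat - mat


def bin_read(read, mat_only, pat_only, k):
    """Assign an offspring long read to a haplotype bin by majority of parent-specific k-mers."""
    read_kmers = kmers(read, k)
    m = len(read_kmers & mat_only)
    p = len(read_kmers & pat_only)
    if m > p:
        return "maternal"
    if p > m:
        return "paternal"
    return "unassigned"  # no informative k-mers, or a tie


# Toy example (hypothetical sequences, k chosen small for readability)
mat_only, pat_only = parent_specific_kmers(["ACGTACGTAA"], ["ACGTACGTCC"], k=4)
label = bin_read("TACGTAAG", mat_only, pat_only, k=4)
```

Each bin would then be passed to an assembler separately, so allelic differences never have to be resolved inside a single assembly graph.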


July 19, 2019  |  

Allele-defined genome of the autopolyploid sugarcane Saccharum spontaneum L.

Modern sugarcanes are polyploid interspecific hybrids, combining high sugar content from Saccharum officinarum with hardiness, disease resistance and ratooning of Saccharum spontaneum. Sequencing of a haploid S. spontaneum, AP85-441, facilitated the assembly of 32 pseudo-chromosomes comprising 8 homologous groups of 4 members each, bearing 35,525 genes with alleles defined. The reduction of basic chromosome number from 10 to 8 in S. spontaneum was caused by fissions of 2 ancestral chromosomes followed by translocations to 4 chromosomes. Surprisingly, 80% of nucleotide binding site-encoding genes associated with disease resistance are located in 4 rearranged chromosomes and 51% of those in rearranged regions. Resequencing of 64 S. spontaneum genomes identified balancing selection in rearranged regions, maintaining their diversity. Introgressed S. spontaneum chromosomes in modern sugarcanes are randomly distributed in the AP85-441 genome, indicating random recombination among homologs in different S. spontaneum accessions. The allele-defined Saccharum genome offers new knowledge and resources to accelerate sugarcane improvement.


July 7, 2019  |  

Genetic stability of pneumococcal isolates during 35 days of human experimental carriage.

Pneumococcal carriage is a reservoir for transmission and a precursor to pneumococcal disease. The experimental human pneumococcal carriage model provides a useful tool to aid vaccine licensure through the measurement of vaccine efficacy against carriage (VEcol). Documentation of the genetic stability of the experimental human pneumococcal carriage model is important to further strengthen confidence in its safety and conclusions, enabling it to further facilitate vaccine licensure through providing evidence of VEcol. 229 isolates were sequenced from 10 volunteers in whom experimental human pneumococcal carriage was established, sampled over a period of 35 days. Multiple isolates from within a single volunteer at a single time provided a deep resolution for detecting variation. HiSeq data from the isolates were mapped against a PacBio reference of the inoculum to call variable sites. The observed variation between experimental carriage isolates was minimal with the maximum SNP distance between any isolate and the reference being 3 SNPs. The low-level variation described provides evidence for the stability of the experimental human pneumococcal carriage model over 35 days, which can be reliably and confidently used to measure VEcol and aid future progression of pneumococcal vaccination. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
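As a rough illustration of the "SNP distance" measure reported above, the sketch below counts mismatched positions between a reference and an isolate sequence that are already aligned to the same coordinates. The function name and the choice to skip ambiguous (`N`) and gap (`-`) positions are assumptions made for illustration; the study's actual variant-calling pipeline is not described in this abstract.

```python
def snp_distance(reference, isolate, ignore="N-"):
    """Count positions where two equal-length aligned sequences differ.

    Positions where either sequence has a character listed in `ignore`
    (assumed here: ambiguous bases and gaps) are not counted.
    """
    if len(reference) != len(isolate):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for ref_base, iso_base in zip(reference.upper(), isolate.upper())
        if ref_base != iso_base
        and ref_base not in ignore
        and iso_base not in ignore
    )


# Toy example: one real mismatch (T>A) and one ambiguous call that is skipped
distance = snp_distance("ACGTACGT", "ACGAACNT")
```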


July 7, 2019  |  

The Streptomyces leeuwenhoekii genome: de novo sequencing and assembly in single contigs of the chromosome, circular plasmid pSLE1 and linear plasmid pSLE2.

Next Generation DNA Sequencing (NGS) and genome mining of actinomycetes and other microorganisms is currently one of the most promising strategies for the discovery of novel bioactive natural products, potentially revealing novel chemistry and enzymology involved in their biosynthesis. This approach also allows rapid insights into the biosynthetic potential of microorganisms isolated from unexploited habitats and ecosystems, which in many cases may prove difficult to culture and manipulate in the laboratory. Streptomyces leeuwenhoekii (formerly Streptomyces sp. strain C34) was isolated from the hyper-arid high-altitude Atacama Desert in Chile and shown to produce novel polyketide antibiotics. Here we present the de novo sequencing of the S. leeuwenhoekii linear chromosome (8 Mb) and two extrachromosomal replicons, the circular pSLE1 (86 kb) and the linear pSLE2 (132 kb), all in single contigs, obtained by combining Pacific Biosciences SMRT (PacBio) and Illumina MiSeq technologies. We identified the biosynthetic gene clusters for chaxamycin, chaxalactin, hygromycin A and desferrioxamine E, metabolites all previously shown to be produced by this strain (J Nat Prod, 2011, 74:1965) and an additional 31 putative gene clusters for specialised metabolites. As well as gene clusters for polyketides and non-ribosomal peptides, we also identified three gene clusters encoding novel lasso-peptides. The S. leeuwenhoekii genome contains 35 gene clusters apparently encoding the biosynthesis of specialised metabolites, most of them completely novel and uncharacterised. This project has served to evaluate the current state of NGS for efficient and effective genome mining of high GC actinomycetes. The PacBio technology now permits the assembly of actinomycete replicons into single contigs with >99% accuracy. The assembled Illumina sequence permitted not only the correction of omissions found in GC homopolymers in the PacBio assembly (exacerbated by the high GC content of actinomycete DNA) but it also allowed us to obtain the sequences of the termini of the chromosome and of a linear plasmid that were not assembled by PacBio. We propose an experimental pipeline that uses the Illumina assembled contigs, in addition to just the reads, to complement the current limitations of the PacBio sequencing technology and assembly software.


July 7, 2019  |  

Genome and transcriptome of the regeneration-competent flatworm, Macrostomum lignano.

The free-living flatworm Macrostomum lignano has an impressive regenerative capacity. Following injury, it can regenerate almost an entirely new organism because of the presence of an abundant somatic stem cell population, the neoblasts. This set of unique properties makes many flatworms attractive organisms for studying the evolution of pathways involved in tissue self-renewal, cell-fate specification, and regeneration. The use of these organisms as models, however, is hampered by the lack of well-assembled and annotated genome sequences, fundamental to modern genetic and molecular studies. Here we report the genomic sequence of M. lignano and an accompanying characterization of its transcriptome. The genome structure of M. lignano is remarkably complex, with ~75% of its sequence being comprised of simple repeats and transposon sequences. This has made high-quality assembly from Illumina reads alone impossible (N50 = 222 bp). We therefore generated 130× coverage by long sequencing reads from the Pacific Biosciences platform to create a substantially improved assembly with an N50 of 64 Kbp. We complemented the reference genome with an assembled and annotated transcriptome, and used both of these datasets in combination to probe gene-expression patterns during regeneration, examining pathways important to stem cell function.
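The N50 values quoted above follow the standard definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length. A minimal sketch of that calculation, with a hypothetical function name and made-up contig lengths, might look like this:

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (standard definition)."""
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Toy example with made-up contig lengths (not data from the study)
example = n50([2, 3, 4, 5, 6])
```

A fragmented assembly made of many short contigs yields a low N50, which is why the jump from 222 bp to 64 Kbp indicates a much more contiguous assembly.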


July 7, 2019  |  

Methicillin-susceptible, vancomycin-resistant Staphylococcus aureus, Brazil.

We report characterization of a methicillin-susceptible, vancomycin-resistant bloodstream isolate of Staphylococcus aureus recovered from a patient in Brazil. Emergence of vancomycin resistance in methicillin-susceptible S. aureus would indicate that this resistance trait might be poised to disseminate more rapidly among S. aureus and represents a major public health threat.


July 7, 2019  |  

Contiguity: Contig adjacency graph construction and visualisation

Contiguity is interactive software for the visualization and manipulation of de novo genome assemblies. Contiguity creates and displays information on contig adjacency which is contextualized by the simultaneous display of a comparison between assembled contigs and reference sequence. Where scaffolders allow unambiguous connections between contigs to be resolved into a single scaffold, Contiguity allows the user to create all potential scaffolds in ambiguous regions of the genome. This enables the resolution of novel sequence or structural variants from the assembly. In addition, Contiguity provides a sequencing and assembly agnostic approach for the creation of contig adjacency graphs. To maximize the number of contig adjacencies determined, Contiguity combines information from read pair mappings, sequence overlap and De Bruijn graph exploration. We demonstrate how highly sensitive graphs can be achieved using this method. Contig adjacency graphs allow the user to visualize potential arrangements of contigs in unresolvable areas of the genome. By combining adjacency information with comparative genomics, Contiguity provides an intuitive approach for exploring and improving sequence assemblies. It is also useful in guiding manual closure of long read sequence assemblies. Contiguity is an open source application, implemented using Python and the Tkinter GUI package that can run on any Unix, OSX and Windows operating system. It has been designed and optimized for bacterial assemblies. Contiguity is available at http://mjsull.github.io/Contiguity .
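To make the idea of a contig adjacency graph concrete, the sketch below builds an undirected graph from a list of adjacent contig pairs and enumerates every simple path between two contigs, which loosely corresponds to listing the potential scaffolds through an ambiguous region. This is not Contiguity's code: the function names and contig labels are hypothetical, and contig orientation and the source of each adjacency (read pairs, overlaps, De Bruijn graph) are ignored.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (contig_a, contig_b) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def all_paths(graph, start, end, path=None):
    """Return every simple path (no contig visited twice) from start to end."""
    path = (path or []) + [start]
    if start == end:
        return [path]
    paths = []
    for neighbour in sorted(graph.get(start, ())):
        if neighbour not in path:
            paths.extend(all_paths(graph, neighbour, end, path))
    return paths


# Toy example: two alternative routes between contigs c1 and c4
graph = build_graph([("c1", "c2"), ("c1", "c3"), ("c2", "c4"), ("c3", "c4")])
candidate_scaffolds = all_paths(graph, "c1", "c4")
```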


July 7, 2019  |  

Genome analysis of Staphylococcus agnetis, an agent of lameness in broiler chickens.

Lameness in broiler chickens is a significant animal welfare and financial issue. Lameness can be enhanced by rearing young broilers on wire flooring. We have identified Staphylococcus agnetis as significantly involved in bacterial chondronecrosis with osteomyelitis (BCO) in proximal tibia and femorae, leading to lameness in broiler chickens in the wire floor system. Administration of S. agnetis in water induces lameness. Previously reported in some cases of cattle mastitis, this is the first report of this poorly described pathogen in chickens. We used long and short read next generation sequencing to assemble single finished contigs for the genome and a large plasmid from the chicken pathogen. Comparison of the S. agnetis genome to those of other pathogenic Staphylococci shows that S. agnetis contains a distinct repertoire of virulence determinants. Additionally, the S. agnetis genome has several regions that differ substantially from the genomes of other pathogenic Staphylococci. Comparison of our finished genome to a recent draft genome for a cattle mastitis isolate suggests that future investigations focus on the evolutionary epidemiology of this emerging pathogen of domestic animals.


July 7, 2019  |  

Leafy spurge genomics: A model perennial weed to investigate development, stress responses, and invasiveness

Leafy spurge is a wild flower native to Europe that has become an invasive perennial weed in the northern Great Plains of the USA and Canada. Leafy spurge primarily infests range and recreation lands and costs US land managers millions of dollars annually. In its invaded range, leafy spurge can form vast monocultures that significantly impact native flora and fauna, and it has been linked to reduced populations of endangered species such as the prairie fringed orchid. Leafy spurge has remarkable plasticity and can persist under environmental extremes, primarily due to the formation of hundreds of underground adventitious buds on its extensive and deep root system. We have developed genomics-based tools to assist our investigations related to vegetative production from these underground buds, as well as its responses to stress, and the potential mechanisms leading to the invasiveness of leafy spurge. Towards these ends, we have utilized Sanger-based sequencing to develop EST databases from leafy spurge and cassava (a related species) transcriptomes, and developed ~23,000-element cDNA microarrays representing all of the unigenes identified in these databases. Additionally, numerous cDNA libraries and genomic libraries have been developed, including bacterial artificial chromosome libraries useful for identifying and characterizing promoters of differentially expressed genes. Finally, to enhance our ability to identify promoter sequences and transcription factors involved in vegetative production, stress responses, and invasiveness, we have incorporated next generation sequencing approaches to fully sequence the leafy spurge genome. Using global transcriptome profiles, next generation sequencing, and bioinformatics programs has provided insights into molecular mechanisms and regulatory pathways that make leafy spurge a particularly invasive and difficult weed to control.


July 7, 2019  |  

Improved draft genome sequence of Clostridium pasteurianum strain ATCC 6013 (DSM 525) using a hybrid next-generation sequencing approach.

We present an improved draft genome sequence for Clostridium pasteurianum strain ATCC 6013 (DSM 525), the type strain of the species and an important solventogenic bacterium with industrial potential. Availability of a near-complete genome sequence will enable strain engineering of this promising bacterium. Copyright © 2014 Pyne et al.


July 7, 2019  |  

Evaluation and validation of de novo and hybrid assembly techniques to derive high-quality genome sequences.

To assess the potential of different types of sequence data combined with de novo and hybrid assembly approaches to improve existing draft genome sequences. Illumina, 454 and PacBio sequencing technologies were used to generate de novo and hybrid genome assemblies for four different bacteria, which were assessed for quality using summary statistics (e.g. number of contigs, N50) and in silico evaluation tools. Differences in predictions of multiple copies of rDNA operons for each respective bacterium were evaluated by PCR and Sanger sequencing, and then the validated results were applied as an additional criterion to rank assemblies. In general, assemblies using longer PacBio reads were better able to resolve repetitive regions. In this study, the combination of Illumina and PacBio sequence data assembled through the ALLPATHS-LG algorithm gave the best summary statistics and most accurate rDNA operon number predictions. This study will aid others looking to improve existing draft genome assemblies. All assembly tools except CLC Genomics Workbench are freely available under GNU General Public License. Contact: brownsd@ornl.gov. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.


July 7, 2019  |  

Association mapping, patterns of linkage disequilibrium and selection in the vicinity of the PHYTOCHROME C gene in pearl millet.

Linkage analysis confirmed the association in the region of PHYC in pearl millet. The comparison of genes found in this region suggests that PHYC is the best candidate. Major efforts are currently underway to dissect the phenotype-genotype relationship in plants and animals using existing populations. This method exploits historical recombinations accumulated in these populations. However, linkage disequilibrium sometimes extends over a relatively long distance, particularly in genomic regions containing polymorphisms that have been targets for selection. In this case, many genes in the region could be statistically associated with the trait shaped by the selected polymorphism. Statistical analyses could help in identifying the best candidate genes in such a region where an association is found. In a previous study, we proposed that a fragment of the PHYTOCHROME C gene (PHYC) is associated with flowering time and morphological variations in pearl millet. In the present study, we first performed linkage analyses using three pearl millet F2 families to confirm the presence of a QTL in the vicinity of PHYC. We then analyzed a wider genomic region of ~100 kb around PHYC to pinpoint the gene that best explains the association with the trait in this region. A panel of 90 pearl millet inbred lines was used to assess the association. We used a Markov chain Monte Carlo approach to compare 75 markers distributed along this 100-kb region. We found the best candidate markers on the PHYC gene. Signatures of selection in this region were assessed in an independent data set and pointed to the same gene. These results foster confidence in the likely role of PHYC in phenotypic variation and encourage the development of functional studies.


July 7, 2019  |  

Global phylogenomic analysis of nonencapsulated Streptococcus pneumoniae reveals a deep-branching classic lineage that is distinct from multiple sporadic lineages.

The surrounding capsule of Streptococcus pneumoniae has been identified as a major virulence factor and is targeted by pneumococcal conjugate vaccines (PCV). However, nonencapsulated S. pneumoniae (non-Ec-Sp) have also been isolated globally, mainly in carriage studies. It is unknown if non-Ec-Sp evolve sporadically, if they have high antibiotic nonsusceptibility rates and a unique, specific gene content. Here, whole-genome sequencing of 131 non-Ec-Sp isolates sourced from 17 different locations around the world was performed. Results revealed a deep-branching classic lineage that is distinct from multiple sporadic lineages. The sporadic lineages clustered with a previously sequenced, global collection of encapsulated S. pneumoniae (Ec-Sp) isolates while the classic lineage is comprised mainly of the frequently identified multilocus sequence types (STs) ST344 (n = 39) and ST448 (n = 40). All ST344 and nine ST448 isolates had high nonsusceptibility rates to β-lactams and other antimicrobials. Analysis of the accessory genome reveals that the classic non-Ec-Sp contained an increased number of mobile elements compared with Ec-Sp and sporadic non-Ec-Sp. Performing adherence assays to human epithelial cells for selected classic and sporadic non-Ec-Sp revealed that the presence of an integrative conjugative element (ICE) results in increased adherence to human epithelial cells (P = 0.005). In contrast, sporadic non-Ec-Sp lacking the ICE had greater growth in vitro, possibly resulting in improved fitness. In conclusion, non-Ec-Sp isolates from the classic lineage have evolved separately. They have spread globally, are well adapted to nasopharyngeal carriage and are able to coexist with Ec-Sp. Due to continued use of PCV, non-Ec-Sp may become more prevalent. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

