pacbio data Archives - Page 14 of 21

July 7, 2019

Comparative analysis of Ralstonia solanacearum methylomes.

Ralstonia solanacearum is an important soil-borne plant pathogen with broad geographical distribution and the ability to cause wilt disease in many agriculturally important crops. Genome sequencing of multiple R. solanacearum strains has identified both unique and shared genetic traits influencing their evolution and ability to colonize plant hosts. Previous research has shown that DNA methylation can drive speciation and modulate virulence in bacteria, but the impact of epigenetic modifications on the diversification and pathogenesis of R. solanacearum is unknown. Sequencing of R. solanacearum strains GMI1000 and UY031 using Single Molecule Real-Time technology allowed us to perform a comparative analysis of R. solanacearum methylomes. Our analysis identified a novel methylation motif associated with a DNA methylase that is conserved in all complete Ralstonia spp. genomes and across the Burkholderiaceae, as well as a methylation motif associated with a phage-borne methylase unique to R. solanacearum UY031. Comparative analysis of the conserved methylation motif revealed that it is most prevalent in gene promoter regions, where it displays a high degree of conservation detectable through phylogenetic footprinting. Analysis of hyper- and hypo-methylated loci identified several genes involved in global and virulence regulatory functions whose expression may be modulated by DNA methylation. Analysis of genome-wide modification patterns identified a significant correlation between DNA modification and transposase genes in R. solanacearum UY031, driven by the presence of a high copy number of ISrso3 insertion sequences in this genome and pointing to a novel mechanism for regulation of transposition. These results set a firm foundation for experimental investigations into the role of DNA methylation in R. solanacearum evolution and its adaptation to different plants.

July 7, 2019

Genome-wide analysis of WOX genes in upland cotton and their expression pattern under different stresses.

WUSCHEL-related homeobox (WOX) family members play significant roles in plant growth and development, such as in embryo patterning, stem-cell maintenance, and lateral organ formation. The recently published cotton genome sequences allow us to perform comprehensive genome-wide analysis and characterization of WOX genes in cotton. In this study, we identified 21, 20, and 38 WOX genes in Gossypium arboreum (2n = 26, A2), G. raimondii (2n = 26, D5), and G. hirsutum (2n = 4x = 52, (AD)t), respectively. Sequence logos showed that homeobox domains were significantly conserved among the WOX genes in cotton, Arabidopsis, and rice. A total of 168 genes from three typical monocots and six dicots were naturally divided into three clades, which were further classified into nine sub-clades. Good collinearity was observed in the synteny analysis of the orthologs from At and Dt (t represents tetraploid) sub-genomes. Whole genome duplication (WGD) and segmental duplication within At and Dt sub-genomes played significant roles in the expansion of WOX genes, and segmental duplication mainly generated the WUS clade. Copia and Gypsy were the two major types of transposable elements distributed upstream or downstream of WOX genes. Furthermore, through comparison, we found that the exon/intron pattern was highly conserved between Arabidopsis and cotton, and the homeobox domain loci were also conserved between them. In addition, the expression pattern in different tissues indicated that the duplicated genes in cotton might have acquired new functions as a result of sub-functionalization or neo-functionalization. The expression pattern of WOX genes under different stress treatments showed that different genes were induced by different stresses. In the present work, WOX genes, classified into three clades, were identified in the upland cotton genome. Whole genome and segmental duplication were determined to be the two major impetuses for the expansion of gene numbers during evolution. Moreover, the expression patterns suggested that the duplicated genes might have experienced a functional divergence. Together, these results shed light on the evolution of the WOX gene family, and would be helpful in future research.

July 7, 2019

A high-coverage draft genome of the mycalesine butterfly Bicyclus anynana.

The mycalesine butterfly Bicyclus anynana, the ‘Squinting bush brown’, is a model organism in the study of lepidopteran ecology, development and evolution. Here, we present a draft genome sequence for B. anynana to serve as a genomics resource for current and future studies of this important model species. Seven libraries with insert sizes ranging from 350 bp to 20 kb were constructed using DNA from an inbred female and sequenced using both Illumina and PacBio technology. 128 Gb of raw Illumina data were filtered to 124 Gb and assembled to a final size of 475 Mb (~260X assembly coverage). Contigs were scaffolded using mate-pair, transcriptome and PacBio data into 10,800 sequences with an N50 of 638 kb (longest scaffold 5 Mb). The genome comprises 26% repetitive elements and encodes a total of 22,642 predicted protein-coding genes. Recovery of a BUSCO set of core metazoan genes was almost complete (98%). Overall, these metrics compare well with other recently published lepidopteran genomes. We report a high-quality draft genome sequence for Bicyclus anynana. The genome assembly and annotated gene models are available at LepBase (http://ensembl.lepbase.org/index.html).
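For readers unfamiliar with the assembly metrics quoted in this abstract, the following is a minimal sketch of how fold coverage and N50 are conventionally computed. The function names `fold_coverage` and `n50` and the toy scaffold lengths are illustrative assumptions, not taken from the paper; only the 124 Gb and 475 Mb figures come from the abstract.

```python
def fold_coverage(total_read_bases, assembly_size):
    """Average depth: sequenced bases divided by assembly size."""
    return total_read_bases / assembly_size


def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Figures from the abstract: 124 Gb of filtered reads, 475 Mb assembly.
coverage = fold_coverage(124e9, 475e6)
print(f"approximate coverage: {coverage:.0f}X")

# Hypothetical toy scaffold lengths, for illustration only.
toy_scaffolds = [5, 4, 3, 2, 1]
print("toy N50:", n50(toy_scaffolds))
```

The coverage figure computed this way agrees with the roughly 260X stated in the abstract. The N50 of the real assembly cannot be reproduced here because the individual scaffold lengths are not listed.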

July 7, 2019

Comparative genomics of all three Campylobacter sputorum biovars and a novel cattle-associated C. sputorum clade.

Campylobacter sputorum is a non-thermotolerant campylobacter that is primarily isolated from food animals such as cattle and sheep. C. sputorum is also infrequently associated with human illness. Based on catalase and urease activity, three biovars are currently recognized within C. sputorum: bv. sputorum (catalase negative, urease negative), bv. fecalis (catalase positive, urease negative), and bv. paraureolyticus (catalase negative, urease positive). A multi-locus sequence typing (MLST) method was recently constructed for C. sputorum. MLST typing of several cattle-associated C. sputorum isolates suggested that they are members of a divergent C. sputorum clade. Although catalase positive, and thus technically bv. fecalis, the taxonomic position of these strains could not be determined solely by MLST. To further characterize C. sputorum, the genomes of four strains, representing all three biovars and the divergent clade, were sequenced to completion. Here we present a comparative genomic analysis of the four C. sputorum genomes. This analysis indicates that the three biovars and the cattle-associated strains are highly related at the genome level with similarities in gene content. Furthermore, the four genomes are strongly syntenic with one or two minor inversions. However, substantial differences in gene content were observed among the three biovars. Finally, although the strain representing the cattle-associated isolates was shown to be C. sputorum, it is possible that this strain is a member of a novel C. sputorum subspecies; thus, these cattle-associated strains may form a second taxon within C. sputorum. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution 2017. This work is written by US Government employees and is in the public domain in the US.

July 7, 2019

PipeCraft: Flexible open-source toolkit for bioinformatics analysis of custom high-throughput amplicon sequencing data.

High-throughput sequencing methods have become a routine analysis tool in environmental sciences as well as in the public and private sectors. These methods provide vast amounts of data, which need to be analysed in several steps. Although the bioinformatics may be applied using several public tools, many analytical pipelines allow too few options for the optimal analysis of more complicated or customized designs. Here, we introduce PipeCraft, a flexible and handy bioinformatics pipeline with a user-friendly graphical interface that links several public tools for analysing amplicon sequencing data. Users are able to customize the pipeline by selecting the most suitable tools and options to process raw sequences from Illumina, Pacific Biosciences, Ion Torrent and Roche 454 sequencing platforms. We described the design and options of PipeCraft and evaluated its performance by analysing the data sets from three different sequencing platforms. We demonstrated that PipeCraft is able to process large data sets within 24 hr. The graphical user interface and the automated links between various bioinformatics tools enable easy customization of the workflow. All analytical steps and options are recorded in log files and are easily traceable. © 2017 John Wiley & Sons Ltd.

July 7, 2019

N-glycan maturation mutants in Lotus japonicus for basic and applied glycoprotein research.

Studies of protein N-glycosylation are important for answering fundamental questions on the diverse functions of glycoproteins in plant growth and development. Here we generated and characterised a comprehensive collection of Lotus japonicus LORE1 insertion mutants, each lacking the activity of one of the 12 enzymes required for normal N-glycan maturation in the glycosylation machinery. The inactivation of the individual genes resulted in altered N-glycan patterns as documented using mass spectrometry and glycan-recognising antibodies, indicating successful identification of null mutations in the target glyco-genes. For example, both mass spectrometry and immunoblotting experiments suggest that proteins derived from the α1,3-fucosyltransferase (Lj3fuct) mutant completely lacked α1,3-core fucosylation. Mass spectrometry also suggested that the Lotus japonicus convicilin 2 was one of the main glycoproteins undergoing differential expression/N-glycosylation in the mutants. Demonstrating the functional importance of glycosylation, reduced growth and seed production phenotypes were observed for the mutant plants lacking functional mannosidase I, N-acetylglucosaminyltransferase I, and α1,3-fucosyltransferase, even though the relative protein composition and abundance appeared unaffected. The strength of our N-glycosylation mutant platform is the broad spectrum of resulting glycoprotein profiles and altered physiological phenotypes that can be produced from single, double, triple and quadruple mutants. This platform will serve as a valuable tool for elucidating the functional role of protein N-glycosylation in plants. Furthermore, this technology can be used to generate stable plant mutant lines for biopharmaceutical production of glycoproteins displaying relatively homogeneous and mammalian-like N-glycosylation features. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.

July 7, 2019

Strategies for optimizing BioNano and Dovetail explored through a second reference quality assembly for the legume model, Medicago truncatula.

Third generation sequencing technologies, with sequencing reads in the tens of kilobases, facilitate genome assembly by spanning ambiguous regions and improving continuity. This has been critical for plant genomes, which are difficult to assemble due to high repeat content, gene family expansions, segmental and tandem duplications, and polyploidy. Recently, high-throughput mapping and scaffolding strategies have further improved continuity. Together, these long-range technologies enable quality draft assemblies of complex genomes in a cost-effective and timely manner. Here, we present high quality genome assemblies of the model legume plant, Medicago truncatula (R108) using PacBio, Dovetail Chicago (hereafter, Dovetail) and BioNano technologies. To test these technologies for plant genome assembly, we generated five assemblies using all possible combinations and ordering of these three technologies in the R108 assembly. While the BioNano and Dovetail joins overlapped, they also showed complementary gains in continuity and join numbers. Both technologies spanned repetitive regions that PacBio alone was unable to bridge. Combining technologies, particularly Dovetail followed by BioNano, resulted in notable improvements compared to Dovetail or BioNano alone. A combination of PacBio, Dovetail, and BioNano was used to generate a high quality draft assembly of R108, a M. truncatula accession widely used in studies of functional genomics. As a test for the usefulness of the resulting genome sequence, the new R108 assembly was used to pinpoint breakpoints and characterize flanking sequence of a previously identified translocation between chromosomes 4 and 8, identifying more than 22.7 Mb of novel sequence not present in the earlier A17 reference assembly. Adding Dovetail followed by BioNano data yielded complementary improvements in continuity over the original PacBio assembly. This strategy proved efficient and cost-effective for developing a quality draft assembly compared to traditional reference assemblies.

July 7, 2019

MECAT: fast mapping, error correction, and de novo assembly for single-molecule sequencing reads.

We present a tool that combines fast mapping, error correction, and de novo assembly (MECAT; accessible at https://github.com/xiaochuanle/MECAT) for processing single-molecule sequencing (SMS) reads. MECAT’s computing efficiency is superior to that of current tools, while the results MECAT produces are comparable or improved. MECAT enables reference mapping or de novo assembly of large genomes using SMS reads on a single computer.

July 7, 2019

Complete genome sequence of a livestock-associated methicillin-resistant Staphylococcus aureus sequence type 5 isolate from the United States.

Livestock-associated methicillin-resistant Staphylococcus aureus (LA-MRSA) may be the largest MRSA reservoir outside the hospital setting. One concern with LA-MRSA is the acquisition of novel mobile genetic elements by these isolates. Here, we report the complete genome sequence of a swine LA-MRSA sequence type 5 isolate from the United States.

July 7, 2019

LRCstats, a tool for evaluating long reads correction methods.

Third-generation sequencing (TGS) platforms that generate long reads, such as PacBio and Oxford Nanopore technologies, have had a dramatic impact on genomics research. However, despite recent improvements, TGS reads suffer from high error rates and the development of read correction methods is an active field of research. This motivates the need to develop tools that can evaluate the accuracy of noisy long reads correction tools. We introduce LRCstats, a tool that measures the accuracy of long reads correction tools. LRCstats takes advantage of long reads simulators that provide each simulated read with an alignment to the reference genome segment they originate from, and does not rely on a step of mapping corrected reads onto the reference genome. This allows for the measurement of the accuracy of the correction while being consistent with the actual errors introduced in the simulation process used to generate noisy reads. We illustrate the usefulness of LRCstats by analyzing the accuracy of four hybrid correction methods for PacBio long reads over three datasets. Availability: https://github.com/cchauve/lrcstats. Contact: laseanl@sfu.ca or cedric.chauve@sfu.ca. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

July 7, 2019

Pearl millet genome sequence provides a resource to improve agronomic traits in arid environments.

Pearl millet [Cenchrus americanus (L.) Morrone] is a staple food for more than 90 million farmers in arid and semi-arid regions of sub-Saharan Africa, India and South Asia. We report the ~1.79 Gb draft whole genome sequence of reference genotype Tift 23D2B1-P1-P5, which contains an estimated 38,579 genes. We highlight the substantial enrichment for wax biosynthesis genes, which may contribute to heat and drought tolerance in this crop. We resequenced and analyzed 994 pearl millet lines, enabling insights into population structure, genetic diversity and domestication. We use these resequencing data to establish marker trait associations for genomic selection, to define heterotic pools, and to predict hybrid performance. We believe that these resources should empower researchers and breeders to improve this important staple crop.

July 7, 2019

Complete genome sequence of livestock-associated methicillin-resistant Staphylococcus aureus sequence type 398 isolated from swine in the United States.

Methicillin-resistant Staphylococcus aureus (MRSA) colonizes and causes disease in many animal species. Livestock-associated MRSA (LA-MRSA) isolates are represented by isolates of the sequence type 398 (ST398). These isolates are considered to be livestock adapted. This report provides the complete genome sequence of one swine-associated LA-MRSA ST398 isolate from the United States.

July 7, 2019

Complete genome sequence of Sulfuriferula sp. strain AH1, a sulfur-oxidizing autotroph isolated from weathered mine tailings from the Duluth Complex in Minnesota.

We report the closed and annotated genome sequence of Sulfuriferula sp. strain AH1. Strain AH1 has a 2,877,007-bp chromosome that includes a partial Sox system for inorganic sulfur oxidation and a complete nitrogen fixation pathway. It also has a single 39,138-bp plasmid with genes for arsenic and mercury resistance. Copyright © 2017 Jones et al.

July 7, 2019

The Apostasia genome and the evolution of orchids.

Constituting approximately 10% of flowering plant species, orchids (Orchidaceae) display unique flower morphologies, possess an extraordinary diversity in lifestyle, and have successfully colonized almost every habitat on Earth. Here we report the draft genome sequence of Apostasia shenzhenica, a representative of one of two genera that form a sister lineage to the rest of the Orchidaceae, providing a reference for inferring the genome content and structure of the most recent common ancestor of all extant orchids and improving our understanding of their origins and evolution. In addition, we present transcriptome data for representatives of Vanilloideae, Cypripedioideae and Orchidoideae, and novel third-generation genome data for two species of Epidendroideae, covering all five orchid subfamilies. A. shenzhenica shows clear evidence of a whole-genome duplication, which is shared by all orchids and occurred shortly before their divergence. Comparisons between A. shenzhenica and other orchids and angiosperms also permitted the reconstruction of an ancestral orchid gene toolkit. We identify new gene families, gene family expansions and contractions, and changes within MADS-box gene classes, which control a diverse suite of developmental processes, during orchid evolution. This study sheds new light on the genetic mechanisms underpinning key orchid innovations, including the development of the labellum and gynostemium, pollinia, and seeds without endosperm, as well as the evolution of epiphytism; reveals relationships between the Orchidaceae subfamilies; and helps clarify the evolutionary history of orchids within the angiosperms.

July 7, 2019

Single-molecule sequencing and Hi-C-based proximity-guided assembly of amaranth (Amaranthus hypochondriacus) chromosomes provide insights into genome evolution.

Amaranth (Amaranthus hypochondriacus) was a food staple among the ancient civilizations of Central and South America that has recently received increased attention due to the high nutritional value of the seeds, with the potential to help alleviate malnutrition and food security concerns, particularly in arid and semiarid regions of the developing world. Here, we present a reference-quality assembly of the amaranth genome which will assist the agronomic development of the species. Utilizing single-molecule, real-time sequencing (Pacific Biosciences) and chromatin interaction mapping (Hi-C) to close assembly gaps and scaffold contigs, respectively, we improved our previously reported Illumina-based assembly to produce a chromosome-scale assembly with a scaffold N50 of 24.4 Mb. The 16 largest scaffolds contain 98% of the assembly and likely represent the haploid chromosomes (n = 16). To demonstrate the accuracy and utility of this approach, we produced physical and genetic maps and identified candidate genes for the betalain pigmentation pathway. The chromosome-scale assembly facilitated a genome-wide syntenic comparison of amaranth with other Amaranthaceae species, revealing chromosome loss and fusion events in amaranth that explain the reduction from the ancestral haploid chromosome number (n = 18) for a tetraploid member of the Amaranthaceae. The assembly method reported here minimizes cost by relying primarily on short-read technology and is one of the first reported uses of in vivo Hi-C for assembly of a plant genome. Our analyses implicate chromosome loss and fusion as major evolutionary events in the 2n = 32 amaranths and clearly establish the homoeologous relationship among most of the subgenome chromosomes, which will facilitate future investigations of intragenomic changes that occurred post polyploidization.
