Read length Archives - Page 15 of 29

September 22, 2019

Complete genome sequencing of the luminescent bacterium, Vibrio qinghaiensis sp. Q67 using PacBio technology.

Vibrio qinghaiensis sp. Q67 (Vqin-Q67) is a freshwater luminescent bacterium that continuously emits blue-green light (485 nm). The bacterium has been widely used for detecting toxic contaminants. Here, we report the complete genome sequence of Vqin-Q67, obtained using third-generation PacBio sequencing technology. Continuous long reads were obtained from three PacBio sequencing runs, and reads >500 bp with a quality value of >0.75 were merged into a single dataset. The resulting highly contiguous de novo assembly has no gaps and comprises two chromosomes with substantial genetic information, including protein-coding genes, non-coding RNAs, transposons, and gene islands. Our dataset can be useful as a comparative genome for evolution and speciation studies, as well as for the analysis of protein-coding gene families, the pathogenicity of different Vibrio species in fish, the evolution of non-coding RNAs and transposons, and the regulation of gene expression in relation to the bioluminescence of Vqin-Q67.
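The read-selection step in this abstract (keep reads longer than 500 bp with a quality value above 0.75, then pool the three runs) can be sketched as below. This is a minimal illustration rather than the authors' pipeline: the `Read` record, its field names, and the helper functions are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List

MIN_LENGTH_BP = 500   # abstract: reads > 500 bp
MIN_QUALITY = 0.75    # abstract: quality value > 0.75


@dataclass(frozen=True)
class Read:
    """Hypothetical read record; the field names are illustrative only."""
    name: str
    sequence: str
    quality: float


def filter_reads(reads: Iterable[Read]) -> List[Read]:
    """Keep reads that pass both the length and the quality threshold."""
    return [
        r for r in reads
        if len(r.sequence) > MIN_LENGTH_BP and r.quality > MIN_QUALITY
    ]


def merge_runs(runs: Iterable[Iterable[Read]]) -> List[Read]:
    """Filter each sequencing run and pool the survivors into one dataset."""
    merged: List[Read] = []
    for run in runs:
        merged.extend(filter_reads(run))
    return merged
```

Both thresholds are strict inequalities, matching the ">" signs in the abstract; a read of exactly 500 bp would be dropped.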

September 22, 2019

Genome sequence of the Japanese oak silk moth, Antheraea yamamai: the first draft genome in the family Saturniidae.

Antheraea yamamai, also known as the Japanese oak silk moth, is a wild species of silk moth. Silk produced by A. yamamai, referred to as tensan silk, differs from the common silk produced by the domesticated silkworm, Bombyx mori, in characteristics such as thickness, compressive elasticity, and chemical resistance. Its unique characteristics have led to its use in many research fields including biotechnology and medical science, and the scientific as well as economic importance of the wild silk moth continues to gradually increase. However, no genomic information for the wild silk moth, including A. yamamai, is currently available. In order to construct the A. yamamai genome, a total of 147 Gb of sequence was generated using the Illumina and PacBio sequencing platforms, providing 210-fold coverage based on the 700-Mb estimated genome size of A. yamamai. The assembled genome of A. yamamai was 656 Mb (>2 kb) with 3675 scaffolds, and the N50 length of the assembly was 739 kb with a 34.07% GC ratio. Identified repeat elements covered 37.33% of the total genome, and the completeness of the constructed genome assembly was estimated to be 96.7% by Benchmarking Universal Single-Copy Orthologs v2 analysis. A total of 15,481 genes were identified using Evidence Modeler based on the gene prediction results obtained from 3 different methods (ab initio, RNA-seq-based, known-gene-based) and manual curation. Here we present the genome sequence of A. yamamai, the first genome sequence of the wild silk moth. These results provide valuable genomic information, which will help enrich our understanding of the molecular mechanisms relating to not only specific phenotypes such as wild silk itself but also the genomic evolution of Saturniidae. © The Authors 2017. Published by Oxford University Press.
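Two of the figures in this abstract follow from simple arithmetic, sketched here. Fold coverage is total sequenced bases divided by genome size (147 Gb over 700 Mb gives 210), and N50 is the length of the scaffold at which the running total, taken from the longest scaffold downward, first reaches half of the assembly. The function names are illustrative, and the scaffold lengths in the usage note are toy values, not data from the paper.

```python
from typing import Sequence


def fold_coverage(total_bases: int, genome_size: int) -> float:
    """Average sequencing depth: sequenced bases per genome base."""
    return total_bases / genome_size


def n50(lengths: Sequence[int]) -> int:
    """Length L such that scaffolds of length >= L hold at least half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("n50 requires at least one scaffold length")


# Figures quoted in the abstract: 147 Gb of reads, 700 Mb estimated genome.
coverage = fold_coverage(147_000_000_000, 700_000_000)
```

With the toy lengths `[100, 200, 300, 400]`, the two longest scaffolds already cover 700 of 1000 bases, so `n50` returns 300.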

September 22, 2019

MUMmer4: A fast and versatile genome alignment system.

The MUMmer system and the genome sequence aligner nucmer included within it are among the most widely used alignment packages in genomics. Since the last major release of MUMmer version 3 in 2004, it has been applied to many types of problems including aligning whole genome sequences, aligning reads to a reference genome, and comparing different assemblies of the same genome. Despite its broad utility, MUMmer3 has limitations that can make it difficult to use for large genomes and for the very large sequence data sets that are common today. In this paper we describe MUMmer4, a substantially improved version of MUMmer that addresses genome size constraints by changing the 32-bit suffix tree data structure at the core of MUMmer to a 48-bit suffix array, and that offers improved speed through parallel processing of input query sequences. With a theoretical limit on the input size of 141 Tbp, MUMmer4 can now work with input sequences of any biologically realistic length. We show that as a result of these enhancements, the nucmer program in MUMmer4 is easily able to handle alignments of large genomes; we illustrate this with an alignment of the human and chimpanzee genomes, which allows us to compute that the two species are 98% identical across 96% of their length. With the enhancements described here, MUMmer4 can also be used to efficiently align reads to reference genomes, although it is less sensitive and accurate than the dedicated read aligners. The nucmer aligner in MUMmer4 can now be called from scripting languages such as Perl, Python and Ruby. These improvements make MUMmer4 one of the most versatile genome alignment packages available.
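To make the suffix-array idea concrete, here is a toy sketch of exact-match lookup over a suffix array. It is not MUMmer4's implementation and does not use its API: the naive construction by sorting suffixes is only practical for short strings, and both function names are made up for the example. The last constant records an arithmetic observation: 2^47 is about 1.41 × 10^14, which matches the 141 Tbp limit quoted above, although the abstract does not say how that limit is derived.

```python
from typing import List


def build_suffix_array(text: str) -> List[int]:
    """Start positions of all suffixes of `text`, in lexicographic order (naive)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, suffix_array: List[int], pattern: str) -> List[int]:
    """All start positions of `pattern` in `text`, found by two binary searches."""
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[start:lo])


# Assumption, not stated in the abstract: 2**47 positions ~ 1.41e14 bases.
POSITIONS_47_BIT = 2 ** 47
```

For the text `"banana"`, the suffix array is `[5, 3, 1, 0, 4, 2]`, and looking up `"ana"` returns positions 1 and 3.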

September 22, 2019

Reference assembly and annotation of the Pyrenophora teres f. teres isolate 0-1.

Pyrenophora teres f. teres, the causal agent of net form net blotch (NFNB) of barley, is a destructive pathogen in barley-growing regions throughout the world. Typical yield losses due to NFNB range from 10 to 40%; however, complete loss has been observed on highly susceptible barley lines where environmental conditions favor the pathogen. Currently, genomic resources for this economically important pathogen are limited to a fragmented draft genome assembly and annotation, with limited RNA support, of the P. teres f. teres isolate 0-1. This research presents an updated 0-1 reference assembly facilitated by long-read sequencing and scaffolding with the assistance of genetic linkage maps. Additionally, genome annotation was mediated by RNAseq analysis using three infection time points and a pure culture sample, resulting in 11,541 high-confidence gene models. The 0-1 genome assembly and annotation presented here now contains the majority of the repetitive content of the genome. Analysis of the 0-1 genome revealed classic characteristics of a “two-speed” genome, being compartmentalized into GC-equilibrated and AT-rich compartments. The assembly of repetitive AT-rich regions will be important for future investigation of genes known as effectors, which often reside in close proximity to repetitive regions. These effectors are responsible for manipulation of the host defense during infection. This updated P. teres f. teres isolate 0-1 reference genome assembly and annotation provides a robust resource for the examination of the barley-P. teres f. teres host-pathogen coevolution. Copyright © 2018 Wyatt et al.

September 22, 2019

First draft genome of an iconic clownfish species (Amphiprion frenatus).

Clownfishes (or anemonefishes) form an iconic group of coral reef fishes, principally known for their mutualistic interaction with sea anemones. They are characterized by particular life history traits, such as a complex social structure and mating system involving sequential hermaphroditism, coupled with an exceptionally long lifespan. Additionally, clownfishes are considered to be one of the rare groups to have experienced an adaptive radiation in the marine environment. Here, we assembled and annotated the first genome of a clownfish species, the tomato clownfish (Amphiprion frenatus). We obtained 17,801 assembled scaffolds, containing a total of 26,917 genes. The completeness of the assembly and annotation was satisfactory, with 96.5% of the Actinopterygii Benchmarking Universal Single-Copy Orthologs (BUSCOs) being retrieved in the A. frenatus assembly. The quality of the resulting assembly is comparable to other bony fish assemblies. This resource is valuable for advancing studies of the particular life history traits of clownfishes, as well as being useful for population genetic studies and the development of new phylogenetic markers. It will also open the way to comparative genomics. Indeed, future genomic comparison among closely related fishes may provide means to identify genes related to the unique adaptations to different sea anemone hosts, as well as better characterize the genomic signatures of an adaptive radiation. © 2018 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.

September 22, 2019

High-quality assembly of Dermatophagoides pteronyssinus genome and transcriptome reveals a wide range of novel allergens.

House dust mites (HDM) are a predominant source of inhalant allergens that contribute to over 50% of worldwide allergy cases, while the full spectrum of HDM allergens remains unknown. Here we sequenced a high-quality genome of Dermatophagoides (D.) pteronyssinus to find known canonical allergens and allergen orthologs inferred from the D. farinae genome.

September 22, 2019

The genome sequence of the commercially cultivated mushroom Agrocybe aegerita reveals a conserved repertoire of fruiting-related genes and a versatile suite of biopolymer-degrading enzymes.

Agrocybe aegerita is an agaricomycete fungus with typical mushroom features, which is commercially cultivated for its culinary use. In nature, it is a saprotrophic or facultative pathogenic fungus causing a white-rot of hardwood in forests of warm and mild climates. The ease of cultivation and fructification on solidified media as well as its archetypal mushroom fruit body morphology render A. aegerita a well-suited model for investigating mushroom developmental biology. Here, the genome of the species is reported and analysed with respect to carbohydrate active genes and genes known to play a role during fruit body formation. In terms of fruit body development, our analyses revealed a conserved repertoire of fruiting-related genes, which corresponds well to the archetypal fruit body morphology of this mushroom. For some genes involved in fruit body formation, paralogisation was observed, but not all fruit body maturation-associated genes known from other agaricomycetes seem to be conserved in the genome sequence of A. aegerita. In terms of lytic enzymes, our analyses suggest a versatile arsenal of biopolymer-degrading enzymes that likely account for the flexible life style of this species. Regarding the number of genes encoding CAZymes relevant for lignin degradation, A. aegerita shows more similarity to white-rot fungi than to litter decomposers, including 18 genes coding for unspecific peroxygenases and three dye-decolourising peroxidase genes expanding its lignocellulolytic machinery. The genome resource will be useful for developing strategies towards genetic manipulation of A. aegerita, which will subsequently allow functional genetics approaches to elucidate fundamentals of fruiting and vegetative growth including lignocellulolysis.

September 22, 2019

Complete genome sequence of Lactobacillus pentosus SLC13, isolated from mustard pickles, a potential probiotic strain with antimicrobial activity against foodborne pathogenic microorganisms.

Lactobacillus pentosus SLC13 is a high exopolysaccharide (EPS)-producing strain with broad-spectrum antimicrobial activity and the ability to grow in simulated gastrointestinal conditions. SLC13 was isolated from mustard pickles in Taiwan for potential probiotic applications. To better understand the molecular basis for its antimicrobial activity and high EPS production, the entire genome of SLC13 was determined by PacBio SMRT sequencing. L. pentosus SLC13 contains a genome with a 3,520,510-bp chromosome and a 62,498-bp plasmid. The GC content of the complete genome was 46.5% and that of plasmid pSLC13 was 41.3%. Sequences were annotated at the RAST prokaryotic genome annotation server, and the results showed that the genome contained 3172 coding sequences and 82 RNA genes. Seventy-six protein-coding sequences were identified on the plasmid pSLC13. A plantaricin gene cluster, which is responsible for bacteriocin biosynthesis and could be associated with its broad-spectrum antimicrobial activity, was identified based on comparative genomic analysis. Two gene clusters involved in EPS production were also identified. This genomic sequence might contribute to a future application of this strain as a probiotic in productive livestock, potentially inhibiting competing and pathogenic organisms.
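GC content, as reported here for the genome and for plasmid pSLC13, is the percentage of G and C among the called bases of a sequence. The sketch below shows one common way to compute it; the function name and the choice to ignore ambiguous bases such as N are assumptions for illustration, not details taken from the paper. The final constant simply adds the two replicon sizes quoted in the abstract.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / called


# Replicon sizes quoted in the abstract (chromosome + plasmid pSLC13).
CHROMOSOME_BP = 3_520_510
PLASMID_BP = 62_498
TOTAL_GENOME_BP = CHROMOSOME_BP + PLASMID_BP
```

For example, `gc_content("GGCCAATT")` gives 50.0, and the summed replicon sizes come to 3,583,008 bp.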

September 22, 2019

The genomes of Crithidia bombi and C. expoeki, common parasites of bumblebees.

Trypanosomatids (Trypanosomatidae, Kinetoplastida) are flagellated protozoa containing many parasites of medical or agricultural importance. Among those, Crithidia bombi and C. expoeki are common parasites in bumble bees around the world, and phylogenetically close to Leishmania and Leptomonas. They have a simple and direct life cycle with one host, and partially castrate the founding queens, greatly reducing their fitness. Here, we report the nuclear genome sequences of one clone of each species, extracted from a field-collected infection. Using a combination of Roche 454 FLX Titanium, Pacific Biosciences PacBio RS, and Illumina GA2 instruments for C. bombi, and PacBio for C. expoeki, we could produce high-quality and well resolved sequences. We find that these genomes are around 32 and 34 Mb, with 7,808 and 7,851 annotated genes for C. bombi and C. expoeki, respectively, which is somewhat less than reported from other trypanosomatids, with few introns, and organized in polycistronic units. A large fraction of genes received plausible functional support in comparison primarily with Leishmania and Trypanosoma. Comparing the annotated genes of the two species with those of six other trypanosomatids (C. fasciculata, L. pyrrhocoris, L. seymouri, B. ayalai, L. major, and T. brucei) shows similar gene repertoires and many orthologs. Similar to other trypanosomatids, we also find signs of concerted evolution in genes putatively involved in the interaction with the host, a high degree of synteny between C. bombi and C. expoeki, and considerable overlap with several other species in the set. A total of 86 orthologous gene groups show signatures of positive selection in the branch leading to the two Crithidia under study, mostly of unknown function. As an example, we examined the initiating glycosylation pathway of surface components in C. bombi, finding it deviates from most other eukaryotes and also from other kinetoplastids, which may indicate rapid evolution in the extracellular matrix that is involved in interactions with the host. Bumble bees are important pollinators and Crithidia infections are suspected to cause substantial selection pressure on their host populations. These newly sequenced genomes provide tools that should help better understand host-parasite interactions in these pollinator pathogens.

September 22, 2019

SimulaTE: simulating complex landscapes of transposable elements of populations.

Motivation: Estimating the abundance of transposable elements (TEs) in populations (or tissues) promises to answer many open research questions. However, progress is hampered by the lack of concordance between different approaches for TE identification and thus potentially unreliable results. Results: To address this problem, we developed SimulaTE, a tool that generates TE landscapes for populations using a newly developed domain specific language (DSL). The simple syntax of our DSL allows for easily building even complex TE landscapes that have, for example, nested, truncated and highly diverged TE insertions. Reads may be simulated for the populations using different sequencing technologies (PacBio, Illumina paired-ends) and strategies (sequencing individuals and pooled populations). The comparison between the expected (i.e. simulated) and the observed results will guide researchers in finding the most suitable approach for a particular research question. Availability and implementation: SimulaTE is implemented in Python and available at https://sourceforge.net/projects/simulates/. Manual: https://sourceforge.net/p/simulates/wiki/Home/#manual; Test data and tutorials: https://sourceforge.net/p/simulates/wiki/Home/#walkthrough; Validation: https://sourceforge.net/p/simulates/wiki/Home/#validation. Contact: robert.kofler@vetmeduni.ac.at

September 22, 2019

Pangenome analyses of the wheat pathogen Zymoseptoria tritici reveal the structural basis of a highly plastic eukaryotic genome.

Structural variation contributes substantially to polymorphism within species. Chromosomal rearrangements that impact genes can lead to functional variation among individuals and influence the expression of phenotypic traits. Genomes of fungal pathogens show substantial chromosomal polymorphism that can drive virulence evolution on host plants. Assessing the adaptive significance of structural variation is challenging, because most studies rely on inferences based on a single reference genome sequence. We constructed and analyzed the pangenome of Zymoseptoria tritici, a major pathogen of wheat that evolved host specialization by chromosomal rearrangements and gene deletions. We used single-molecule real-time sequencing and high-density genetic maps to assemble multiple genomes. We annotated the gene space based on transcriptomics data that covered the infection life cycle of each strain. Based on a total of five telomere-to-telomere genomes, we constructed a pangenome for the species and identified a core set of 9149 genes. However, an additional 6600 genes were exclusive to a subset of the isolates. The substantial accessory genome encoded on average fewer expressed genes but a larger fraction of the candidate effector genes that may interact with the host during infection. We expanded our analyses of the pangenome to a worldwide collection of 123 isolates of the same species. We confirmed that accessory genes were indeed more likely to show deletion polymorphisms and loss-of-function mutations compared to core genes. The pangenome construction of a highly polymorphic eukaryotic pathogen showed that a single reference genome significantly underestimates the gene space of a species. The substantial accessory genome provides a cradle for adaptive evolution.
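The core/accessory split described in this abstract can be expressed with plain set operations once each genome is reduced to a set of gene (or orthogroup) identifiers: core genes are present in every genome, and accessory genes are present in some but not all. The sketch below assumes that reduction has already been done; the function name, isolate labels, and gene identifiers are hypothetical, and orthology clustering itself is out of scope.

```python
from typing import Dict, Set, Tuple


def split_pangenome(gene_sets: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (core, accessory) gene sets for a collection of genomes."""
    if not gene_sets:
        raise ValueError("at least one genome is required")
    genomes = list(gene_sets.values())
    pangenome = set().union(*genomes)      # genes seen in any genome
    core = set.intersection(*genomes)      # genes seen in every genome
    accessory = pangenome - core           # genes missing from at least one genome
    return core, accessory


# Toy input: three hypothetical isolates with made-up gene identifiers.
toy_genomes = {
    "isolate_A": {"g1", "g2", "g3"},
    "isolate_B": {"g1", "g2", "g4"},
    "isolate_C": {"g1", "g2", "g3", "g5"},
}
core_genes, accessory_genes = split_pangenome(toy_genomes)
```

Under these definitions the pangenome is the union of the two sets, so the figures quoted above (9149 core genes plus 6600 genes found only in a subset of isolates) would correspond to 15,749 genes across the five genomes, assuming the two categories do not overlap.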

September 22, 2019

Characterization of plasmids harboring blaCTX-M and blaCMY genes in E. coli from French broilers.

Resistance to extended-spectrum cephalosporins (ESC) is a global health issue. The aim of this study was to analyze and compare plasmids coding for resistance to ESC isolated from 16 avian commensal and 17 avian pathogenic Escherichia coli (APEC) strains obtained at the slaughterhouse and from diseased broilers, respectively, in 2010-2012. Plasmid DNA was used to transform E. coli DH5alpha, and the resistances of the transformants were determined. The sequences of the ESC-resistance plasmids prepared from transformants were obtained by Illumina (33 plasmids) or PacBio (1 plasmid). Results showed that 29 of these plasmids contained the blaCTX-M-1 gene and belonged to the IncI1/ST3 type, with 27 and 20 of them carrying the sul2 or tet(A) genes, respectively. Despite their diverse origins, several plasmids showed very high percentages of identity. None of the blaCTX-M-1-containing plasmids contained APEC virulence genes, although some of these genes were detected in the parental strains. Three plasmids had the blaCMY-2 gene, but no other resistance gene. They belonged to IncB/O/K/Z-like or IncFIA/FIB replicon types. The blaCMY-2 IncFIA/FIB plasmid was obtained from a strain isolated from a diseased broiler and also containing a blaCTX-M-1 IncI1/ST3 plasmid. Importantly, APEC virulence genes (sitA-D, iucA-D, iutA, hlyF, ompT, etsA-C, iss, iroB-E, iroN, cvaA-C and cvi) were detected on the blaCMY-2 plasmid. In conclusion, our results show the dominance and high similarity of blaCTX-M-1 IncI1/ST3 plasmids, and the worrying presence of APEC virulence genes on a blaCMY-2 plasmid.

September 22, 2019

Candidatus Nitrosocaldus cavascurensis, an ammonia oxidizing, extremely thermophilic archaeon with a highly mobile genome.

Ammonia oxidizing archaea (AOA) of the phylum Thaumarchaeota are widespread in moderate environments but their occurrence and activity have also been demonstrated in hot springs. Here we present the first enrichment of a thermophilic representative with a sequenced genome, which facilitates the search for adaptive strategies and for traits that shape the evolution of Thaumarchaeota. Candidatus Nitrosocaldus cavascurensis has been enriched from a hot spring in Ischia, Italy. It grows optimally at 68°C under chemolithoautotrophic conditions on ammonia or urea, converting ammonia stoichiometrically into nitrite with a generation time of approximately 23 h. Phylogenetic analyses based on ribosomal proteins place the organism as a sister group to all known mesophilic AOA. The 1.58 Mb genome of Ca. N. cavascurensis harbors an amoAXCB gene cluster encoding ammonia monooxygenase and genes for a 3-hydroxypropionate/4-hydroxybutyrate pathway for autotrophic carbon fixation, but also genes that indicate potential alternative energy metabolisms. Although a bona fide gene for nitrite reductase is missing, the organism is sensitive to NO-scavenging, underlining the potential importance of this compound for AOA metabolism. Ca. N. cavascurensis is distinct from all other AOA in its gene repertoire for replication, cell division and repair. Its genome has an impressive array of mobile genetic elements and other recently acquired gene sets, including conjugative systems, a provirus, transposons and cell appendages. Some of these elements indicate recent exchange with the environment, whereas others seem to have been domesticated and might convey crucial metabolic traits.

September 22, 2019

Comparative genomics of completely sequenced Lactobacillus helveticus genomes provides insights into strain-specific genes and resolves metagenomics data down to the strain level.

Although complete genome sequences hold particular value for an accurate description of core genomes, the identification of strain-specific genes, and as the optimal basis for functional genomics studies, they are still largely underrepresented in public repositories. Based on an assessment of the genome assembly complexity for all lactobacilli, we used Pacific Biosciences’ long read technology to sequence and de novo assemble the genomes of three Lactobacillus helveticus starter strains, raising the number of completely sequenced strains to 12. The first comparative genomics study for L. helveticus, to our knowledge, identified a core genome of 988 genes and sets of unique, strain-specific genes ranging from about 30 to more than 200 genes. Importantly, the comparison of MiSeq- and PacBio-based assemblies uncovered that not only accessory but also core genes can be missed in incomplete genome assemblies based on short reads. Analysis of the three genomes revealed that a large number of pseudogenes were enriched for functional Gene Ontology categories such as amino acid transmembrane transport and carbohydrate metabolism, which is in line with a reductive genome evolution in the rich natural habitat of L. helveticus. Notably, the functional Clusters of Orthologous Groups of proteins categories “cell wall/membrane biogenesis” and “defense mechanisms” were found to be enriched among the strain-specific genes. A genome mining effort uncovered examples where an experimentally observed phenotype could be linked to the underlying genotype, such as for cell envelope proteinase PrtH3 of strain FAM8627. Another possible link identified for peptidoglycan hydrolases will require further experiments. Of note, strain FAM22155 did not harbor a CRISPR/Cas system; its loss was also observed in other L. helveticus strains and Lactobacillus species, thus questioning the value of the CRISPR/Cas system for diagnostic purposes. Importantly, the complete genome sequences proved to be very useful for the analysis of natural whey starter cultures with metagenomics, as a larger percentage of the sequenced reads of these complex mixtures could be unambiguously assigned down to the strain level.

September 22, 2019

Redkmer: An Assembly-Free Pipeline for the Identification of Abundant and Specific X-Chromosome Target Sequences for X-Shredding by CRISPR Endonucleases.

CRISPR-based synthetic sex ratio distorters, which operate by shredding the X-chromosome during male meiosis, are promising tools for the area-wide control of harmful insect pest or disease vector species. X-shredders have been proposed as tools to suppress insect populations by biasing the sex ratio of the wild population toward males, thus reducing its natural reproductive potential. However, to build synthetic X-shredders based on CRISPR, the selection of gRNA targets, in the form of high-copy sequence repeats on the X chromosome of a given species, is difficult, since such repeats are not accurately resolved in genome assemblies and cannot be assigned to chromosomes with confidence. We have therefore developed the redkmer computational pipeline, designed to identify short and highly abundant sequence elements occurring uniquely on the X chromosome. Redkmer was designed to use as input minimally processed whole genome sequence data from males and females. We tested redkmer with short- and long-read whole genome sequence data of Anopheles gambiae, the major vector of human malaria, in which the X-shredding paradigm was originally developed. Redkmer established long reads as chromosomal proxies with excellent correlation to the genome assembly and used them to rank X-candidate kmers for their level of X-specificity and abundance. Among these, a high-confidence set of 25-mers was identified, many belonging to previously known X-chromosome repeats of Anopheles gambiae, including the ribosomal gene array and the selfish elements harbored within it. Data from a control strain, in which these repeats are shared with the Y chromosome, confirmed the elimination of these kmers during filtering. Finally, we show that redkmer output can be linked directly to gRNA selection and off-target prediction. In addition, the output of redkmer, including the prediction of chromosomal origin of single-molecule long reads and chromosome specific kmers, could also be used for the characterization of other biologically relevant sex chromosome sequences, a task that is frequently hampered by the repetitiveness of sex chromosome sequence content.
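The general idea of ranking k-mers by abundance and by a sex-linked signal can be illustrated with a small sketch. It is not redkmer's algorithm, whose statistics the abstract does not spell out. The assumption made here is that, at equal sequencing depth, a k-mer on the X chromosome appears roughly twice as often in female (XX) reads as in male (XY) reads, so a female-to-male count ratio is used as a crude X-enrichment score. All names and thresholds are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def count_kmers(reads: Iterable[str], k: int) -> Counter:
    """Count every overlapping k-mer across a collection of reads."""
    counts: Counter = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def x_candidates(
    female_reads: Iterable[str],
    male_reads: Iterable[str],
    k: int,
    min_ratio: float = 1.5,   # hypothetical cut-off for female/male enrichment
    min_count: int = 2,       # hypothetical minimum abundance in female reads
) -> List[Tuple[str, int, float]]:
    """Rank k-mers that are abundant and enriched in female relative to male reads."""
    female = count_kmers(female_reads, k)
    male = count_kmers(male_reads, k)
    hits = []
    for kmer, f_count in female.items():
        m_count = male.get(kmer, 0)
        if m_count == 0:
            continue  # males also carry an X, so a true X k-mer should appear in males
        ratio = f_count / m_count
        if f_count >= min_count and ratio >= min_ratio:
            hits.append((kmer, f_count, ratio))
    # Most abundant first; ties broken alphabetically for a stable order.
    hits.sort(key=lambda hit: (-hit[1], hit[0]))
    return hits
```

The abstract reports a high-confidence set of 25-mers, so `k=25` would be the natural setting on real data; a real analysis would also need depth normalisation between the male and female libraries, which this sketch leaves out.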

