De novo assembly Archives

April 21, 2020

The Chinese chestnut genome: a reference for species restoration

Forest tree species are increasingly subject to severe mortalities from exotic pests, diseases, and invasive organisms, accelerated by climate change. Forest health issues are threatening multiple species and ecosystem sustainability globally. While sources of resistance may be available in related species, or among surviving trees, introgression of resistance genes into threatened tree species in reasonable time frames requires genome-wide breeding tools. Asian species of chestnut (Castanea spp.) are being employed as donors of disease resistance genes to restore native chestnut species in North America and Europe. To aid in the restoration of threatened chestnut species, we present the assembly of a reference genome with chromosome-scale sequences for Chinese chestnut (C. mollissima), the disease-resistance donor for American chestnut restoration. We also demonstrate the value of the genome as a platform for research and species restoration, including new insights into the evolution of blight resistance in Asian chestnut species, the locations in the genome of ecologically important signatures of selection differentiating American chestnut from Chinese chestnut, the identification of candidate genes for disease resistance, and preliminary comparisons of genome organization with related species.

April 21, 2020

Soil Probiotic Utilizes Plant and Pollinator Transport for Territorial Expansion

Microbe-plant interactions are linked with the core microbiota, and both the plant and the microbial partners depend on one another to thrive in nature. However, why and how the below-ground core microbiota become established aboveground is poorly understood. We tracked the movement of a probiotic Streptomyces endophyte throughout a managed strawberry ecosystem. Probiotics in the rhizosphere and anthosphere were genetically identical, yet these niches were segregated in space and time. The probiotic in the rhizosphere moved upward via the vascular bundle, relocated to aboveground plant parts, and protected against Botrytis cinerea. It also moved from flowers to roots, and among flowers via pollinators that were protected against pollinator pathogens. Our results provide solid evidence of a tripartite interaction in which Streptomyces exploits plant and pollinator partners.

April 21, 2020

Benchmarking Transposable Element Annotation Methods for Creation of a Streamlined, Comprehensive Pipeline

Sequencing technology and assembly algorithms have matured to the point that high-quality de novo assembly is possible for large, repetitive genomes. Current assemblies traverse transposable elements (TEs) and allow for annotation of TEs. There are numerous methods for each class of elements with unknown relative performance metrics. We benchmarked existing programs based on a curated library of rice TEs. Using the most robust programs, we created a comprehensive pipeline called Extensive de-novo TE Annotator (EDTA) that produces a condensed TE library for annotations of structurally intact and fragmented elements. EDTA is open-source and freely available: https://github.com/oushujun/EDTA.

List of abbreviations: TE, Transposable Elements; LTR, Long Terminal Repeat; LINE, Long Interspersed Nuclear Element; SINE, Short Interspersed Nuclear Element; MITE, Miniature Inverted Transposable Element; TIR, Terminal Inverted Repeat; TSD, Target Site Duplication; TP, True Positives; FP, False Positives; TN, True Negative; FN, False Negatives; GRF, Generic Repeat Finder; EDTA, Extensive de-novo TE Annotator.
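The abbreviation list names TP, FP, TN and FN, the four counts of a confusion matrix. As a minimal sketch of how such counts are commonly turned into benchmark scores, the Python below computes the standard textbook metrics. The class name, function name and example counts are hypothetical, and the exact metric definitions used in the EDTA benchmark may differ from these.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing an annotation against a curated reference (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def benchmark_metrics(c: ConfusionCounts) -> dict:
    """Compute standard confusion-matrix metrics (textbook definitions, assumed here)."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # fraction of true TE sequence recovered
        "specificity": _ratio(c.tn, c.tn + c.fp),  # fraction of non-TE sequence left unannotated
        "precision": _ratio(c.tp, c.tp + c.fp),    # fraction of annotated sequence that is truly TE
        "accuracy": _ratio(c.tp + c.tn, total),
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    example = ConfusionCounts(tp=80, fp=20, tn=880, fn=20)
    for name, value in benchmark_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

Returning NaN for an undefined ratio, rather than raising, lets a benchmark table be filled in even when a program reports no positives for a given element class.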

April 21, 2020

Complete genome sequence of Paenisporosarcina antarctica CGMCC 1.6503 T, a marine psychrophilic bacterium isolated from Antarctica

A marine psychrophilic bacterium, _Paenisporosarcina antarctica_ CGMCC 1.6503T (= JCM 14646T), was isolated off King George Island, Antarctica (62°13′31″ S, 58°57′08″ W). In this study, we report the complete genome sequence of _Paenisporosarcina antarctica_, which comprises 3,972,524 bp with a mean G + C content of 37.0%. Gene function and metabolic pathway analyses showed that strain CGMCC 1.6503T encodes a series of genes related to cold adaptation, including those encoding fatty acid desaturases, dioxygenases, antifreeze proteins and cold shock proteins, and possesses several two-component regulatory systems, which could assist this strain in responding to cold, oxygen and osmotic stress in Antarctica. The complete genome sequence of _P. antarctica_ may provide further insights into the genetic mechanism of cold adaptation for Antarctic marine bacteria.
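Several abstracts on this page report a mean G + C content. As a minimal sketch of how that percentage is usually computed, the Python below counts G and C bases over all unambiguous bases. The function name is hypothetical, and excluding ambiguous bases such as N from the denominator is an assumption; tools differ on that point.

```python
def gc_content(sequence: str) -> float:
    """Return the G + C percentage of a DNA sequence.

    Only A, C, G and T are counted; ambiguous symbols such as N are
    excluded from the denominator (an assumption of this sketch).
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


if __name__ == "__main__":
    # Toy sequence, not taken from any genome described here.
    print(f"{gc_content('ATGCGCTA'):.1f}%")  # 4 of 8 bases are G or C
```

For a multi-replicon genome, the same calculation would be applied to the concatenated sequences so that each replicon is weighted by its length.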

April 21, 2020

Emergence of plasmid-mediated high-level tigecycline resistance genes in animals and humans.

Tigecycline is a last-resort antibiotic that is used to treat severe infections caused by extensively drug-resistant bacteria. tet(X) has been shown to encode a flavin-dependent monooxygenase that modifies tigecycline [1,2]. Here, we report two unique mobile tigecycline-resistance genes, tet(X3) and tet(X4), in numerous Enterobacteriaceae and Acinetobacter that were isolated from animals, meat for consumption and humans. Tet(X3) and Tet(X4) inactivate all tetracyclines, including tigecycline and the newly FDA-approved eravacycline and omadacycline. Both tet(X3) and tet(X4) increase (by 64- to 128-fold) the tigecycline minimal inhibitory concentration values for Escherichia coli, Klebsiella pneumoniae and Acinetobacter baumannii. In addition, both Tet(X3) (A. baumannii) and Tet(X4) (E. coli) significantly compromise tigecycline in in vivo infection models. Both tet(X3) and tet(X4) are adjacent to insertion sequence ISVsa3 on their respective conjugative plasmids and confer a mild fitness cost (relative fitness of >0.704). Database mining and retrospective screening analyses confirm that tet(X3) and tet(X4) are globally present in clinical bacteria, even in the same bacteria as blaNDM-1, resulting in resistance to both tigecycline and carbapenems. Our findings suggest that both the surveillance of tet(X) variants in clinical and animal sectors and the use of tetracyclines in food production require urgent global attention.

April 21, 2020

Complete genome of a marine bacterium Vibrio chagasii ECSMB14107 with the ability to infect mussels

Vibrio strains are pervasive in the aquatic environment and may form pathogenic and symbiotic relationships with the host. Vibrio chagasii ECSMB14107 was isolated from natural biofilms and is used as a model to elucidate the role of Vibrio in hard-shelled mussel (Mytilus coruscus) settlement, health and disease. The genome of strain ECSMB14107 was determined; it comprises two circular chromosomes that together encompass 5,549,357 bp, with a mean GC content of 44.39%. Knowledge about the genome of V. chagasii ECSMB14107 will provide insight into its contribution to mussel development and health.

April 21, 2020

Draft genome sequence resource of switchgrass rust pathogen, Puccinia novopanici isolate Ard-01.

Puccinia novopanici is an important biotrophic fungal pathogen that causes rust disease in switchgrass. Lack of genomic resources for P. novopanici has hampered progress towards developing effective disease resistance against this pathogen. Therefore, we have sequenced the whole genome of P. novopanici and generated a framework for future work to understand pathogenicity mechanisms, identify effectors, and study repeat element invasion, genome evolution, and comparative genomics among Puccinia species. Long and short read sequences were generated from P. novopanici genomic DNA by PacBio and Illumina technologies, respectively, and assembled into a 99.9 megabase (Mb) genome. Transcripts of P. novopanici were predicted from the assembled genome using MAKER and were further validated by RNAseq data. The genome sequence information of P. novopanici will be a valuable resource for researchers working on monocot rusts and plant disease resistance in general.

April 21, 2020

Complete Genome of Bacillus velezensis CMT-6 and Comparative Genome Analysis Reveals Lipopeptide Diversity.

The complete genome sequence of Bacillus velezensis type strain CMT-6 is presented for the first time. A comparative analysis of the CMT-6 genome with the genomes of Bacillus amyloliquefaciens DSM7T, B. velezensis FZB42, and Bacillus subtilis 168 revealed major differences in the lipopeptide synthesis genes. Of the above, only the CMT-6 strain possessed an integrated synthetase gene for synthesizing surfactin, iturin, and fengycin. However, CMT-6 shared 14, 12, and 10 other lipopeptide-producing genes with FZB42, DSM7T, and 168, respectively. The largest number of non-synonymous mutations was detected between CMT-6 and 168, in 205 gene sequences involved in producing these three lipopeptides. Comparing CMT-6 with DSM7T, 58 non-synonymous mutations were detected in gene sequences that contribute to lipopeptide production. In addition, InDels were identified in the yczE and glnR genes. CMT-6 and FZB42 had the lowest number of non-synonymous mutations, with 8 lipopeptide-related gene sequences, and InDels were identified only in yczE. The numbers of core genes, InDels, and non-synonymous mutations in genes were the main reasons for the differences in yield and variety of lipopeptides. These results will enrich the genomic resources available for B. velezensis and provide fundamental information to construct strains that can produce specific lipopeptides.

April 21, 2020

Integrating multiple genomic technologies to investigate an outbreak of carbapenemase-producing Enterobacter hormaechei

Carbapenem-resistant Enterobacteriaceae (CRE) represent one of the most urgent threats to human health posed by antibiotic resistant bacteria. Enterobacter hormaechei and other members of the Enterobacter cloacae complex are the most commonly encountered Enterobacter spp. within clinical settings, responsible for numerous outbreaks and ultimately poorer patient outcomes. Here we applied three complementary whole genome sequencing (WGS) technologies to characterise a hospital cluster of blaIMP-4 carbapenemase-producing E. hormaechei. In response to a suspected CRE outbreak in 2015 within an Intensive Care Unit (ICU)/Burns Unit in a Brisbane tertiary referral hospital, we used Illumina sequencing to determine that all outbreak isolates were sequence type (ST)90 and near-identical at the core genome level. Comparison to publicly available data unequivocally linked all 10 isolates to a 2013 isolate from the same ward, confirming the hospital environment as the most likely original source of infection in the 2015 cases. No clonal relationship was found to IMP-4-producing isolates identified from other local hospitals. However, using Pacific Biosciences long-read sequencing we were able to resolve the complete context of the blaIMP-4 gene, which was found to be on a large IncHI2 plasmid carried by all IMP-4-producing isolates. Continued surveillance of the hospital environment was carried out using Oxford Nanopore long-read sequencing, which was able to rapidly resolve the true relationship of subsequent isolates to the initial outbreak. Shotgun metagenomic sequencing of environmental samples also found evidence of ST90 E. hormaechei and the IncHI2 plasmid within the hospital plumbing. Overall, our strategic application of three WGS technologies provided an in-depth analysis of the outbreak, including the transmission dynamics of a carbapenemase-producing E. hormaechei cluster, identification of possible hospital reservoirs and the full context of blaIMP-4 on a multidrug resistant IncHI2 plasmid that appears to be widely distributed in Australia.

April 21, 2020

Complete genome sequence and characterization of virulence genes in Lancefield group C Streptococcus dysgalactiae isolated from farmed amberjack (Seriola dumerili).

Lancefield group C Streptococcus dysgalactiae causes infections in farmed fish. Here, the genome of S. dysgalactiae strain kdys0611, isolated from farmed amberjack (Seriola dumerili), was sequenced. The complete genome sequence of kdys0611 consists of a single chromosome and five plasmids. The chromosome is 2,142,780 bp long and has a GC content of 40%. It possesses 2061 coding sequences and 67 tRNA and 6 rRNA operons. One clustered regularly interspaced short palindromic repeat, 125 insertion sequences, and four predicted prophage elements were identified. Phylogenetic analysis based on 126 core genes suggested that the kdys0611 strain is more closely related to S. dysgalactiae subsp. dysgalactiae than to S. dysgalactiae subsp. equisimilis. The genome of kdys0611 harbors 87 genes with sequence similarity to putative virulence-associated genes identified in other bacteria, of which 57 exhibit amino acid identity (>52%) to genes of the S. dysgalactiae subsp. equisimilis GGS124 human clinical isolate. Four putative virulence genes, emm5 (FGCSD_0256), spg_2 (FGCSD_1961), skc (FGCSD_1012), and cna (FGCSD_0159), in kdys0611 did not show significant homology with any deposited S. dysgalactiae genes. The chromosomal sequence of kdys0611 has been deposited in GenBank under Accession No. AP018726. This is the first report of the complete genome sequence of S. dysgalactiae isolated from fish. © 2019 The Societies and John Wiley & Sons Australia, Ltd.

April 21, 2020

Genome analysis and Hi-C assisted assembly of Elaeagnus angustifolia L., a deciduous tree belonging to Elaeagnaceae

Elaeagnus angustifolia L. is a deciduous tree of the Elaeagnaceae family. It is widely used in the study of abiotic stress tolerance in plants and for the improvement of desertification-affected land because of its drought resistance, salt tolerance, cold resistance, wind resistance, and other environmental adaptations. Here, we report the complete genome sequencing using the Pacific Biosciences (PacBio) platform and Hi-C assisted assembly of E. angustifolia. A total of 44.27 Gb of raw PacBio Sequel reads were obtained after filtering out low-quality data, with an average length of 8.64 Kb. Assembly using Canu gave an assembly length of 781.09 Mb, with a contig N50 of 486.92 Kb. A total of 39.56 Gb of clean reads was obtained, with a sequencing coverage of 75× and a Q30 ratio > 95.46%. The 510.71 Mb genomic sequence was mapped to the chromosome, accounting for 96.94% of the total length of the sequence, and the corresponding number of sequences was 269, accounting for 45.83% of the total number of sequences. The genome sequence study of E. angustifolia can be a valuable source for the comparative genome analysis of the Elaeagnaceae family members, and can help to understand the evolutionary response mechanisms of the Elaeagnaceae to drought, salt, cold and wind resistance, and thereby provide effective theoretical support for the improvement of desertification-affected land.
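The contig N50 quoted above is a standard assembly contiguity metric: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. A minimal sketch of that definition in Python follows; the function name and the toy contig lengths are hypothetical, and assembly tools may differ in details such as minimum-length filtering.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest and accumulated until
    the running sum reaches at least half of the total length; the
    length of the contig that crosses that threshold is the N50.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if 2 * running >= total:  # integer comparison avoids float rounding
            return length
    raise AssertionError("unreachable: running sum always reaches the total")


if __name__ == "__main__":
    # Toy lengths, for illustration only: total is 20, half is 10,
    # and 6 + 5 = 11 crosses the threshold at the contig of length 5.
    print(n50([2, 3, 4, 5, 6]))  # prints 5
```

Because N50 depends only on the length distribution, it says nothing about correctness; a misassembled but long contig raises it just as a correct one does.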

April 21, 2020

Insect genomes: progress and challenges.

In the wake of constant improvements in sequencing technologies, numerous insect genomes have been sequenced. Currently, 1219 insect genome-sequencing projects have been registered with the National Center for Biotechnology Information, including 401 that have genome assemblies and 155 with an official gene set of annotated protein-coding genes. Comparative genomics analysis showed that the expansion or contraction of gene families was associated with well-studied physiological traits such as immune system, metabolic detoxification, parasitism and polyphagy in insects. Here, we summarize the progress of insect genome sequencing, with an emphasis on how this impacts research on pest control. We begin with a brief introduction to the basic concepts of genome assembly, annotation and metrics for evaluating the quality of draft assemblies. We then provide an overview of genome information for numerous insect species, highlighting examples from prominent model organisms, agricultural pests and disease vectors. We also introduce the major insect genome databases. The increasing availability of insect genomic resources is beneficial for developing alternative pest control methods. However, many opportunities remain for developing data-mining tools that make maximal use of the available insect genome resources. Although rapid progress has been achieved, many challenges remain in the field of insect genomics. © 2019 The Royal Entomological Society.

April 21, 2020

Complete genome of Pseudomonas sp. DMSP-1 isolated from the Arctic seawater of Kongsfjorden, Svalbard

The genus Pseudomonas is highly metabolically diverse and has colonized a wide range of ecological niches. The strain Pseudomonas sp. DMSP-1 was isolated from Arctic seawater (Kongsfjorden, Svalbard) using dimethylsulfoniopropionate (DMSP) as the sole carbon source. To better understand its role in the Arctic coastal ecosystem, the genome of Pseudomonas sp. strain DMSP-1 was completely sequenced. The genome contained a circular chromosome of 6,282,445 bp with an average GC content of 60.01 mol%. A total of 5510 protein coding genes, 70 tRNA genes and 19 rRNA genes were obtained. However, no genes encoding known enzymes associated with DMSP catabolism were identified in the genome, suggesting that novel DMSP degradation genes might exist in Pseudomonas sp. strain DMSP-1.

April 21, 2020

deSALT: fast and accurate long transcriptomic read alignment with de Bruijn graph-based index

Long-read RNA sequencing (RNA-seq) is promising for transcriptomics studies; however, the alignment of the reads is still a fundamental but non-trivial task due to sequencing errors and complicated gene structures. We propose deSALT, a tailored two-pass long RNA-seq read alignment approach, which constructs graph-based alignment skeletons to sensitively infer exons, and uses them to generate spliced reference sequences to produce refined alignments. deSALT addresses several difficult issues, such as small exons, serious sequencing errors and consensus spliced alignment. Benchmarks demonstrate that this approach has a better ability to produce high-quality full-length alignments, which has enormous potential for transcriptomics studies.

April 21, 2020

Extended haplotype phasing of de novo genome assemblies with FALCON-Phase

Haplotype-resolved genome assemblies are important for understanding how combinations of variants impact phenotypes. These assemblies can be created in various ways, such as use of tissues that contain single-haplotype (haploid) genomes, or by co-sequencing of parental genomes, but these approaches can be impractical in many situations. We present FALCON-Phase, which integrates long-read sequencing data and ultra-long-range Hi-C chromatin interaction data of a diploid individual to create high-quality, phased diploid genome assemblies. The method was evaluated by application to three datasets, including human, cattle, and zebra finch, for which high-quality, fully haplotype resolved assemblies were available for benchmarking. Phasing algorithm accuracy was affected by heterozygosity of the individual sequenced, with higher accuracy for cattle and zebra finch (>97%) compared to human (82%). In addition, scaffolding with the same Hi-C chromatin contact data resulted in phased chromosome-scale scaffolds.
