Hybrid assembly Archives

July 7, 2019

Genome sequence of enterotoxigenic Escherichia coli strain FMU073332.

Enterotoxigenic Escherichia coli (ETEC) is an important cause of bacterial diarrheal illness, affecting practically every population worldwide, and was estimated to cause 120,800 deaths in 2010. Here, we report the genome sequence of ETEC strain FMU073332, isolated from a 25-month-old girl from Tlaltizapán, Morelos, México. Copyright © 2017 Saldaña-Ahuactzi et al.

July 7, 2019

Complete genome sequence of Edwardsiella hoshinae ATCC 35051.

Edwardsiella hoshinae is a Gram-negative facultative anaerobe that has primarily been isolated from avians and reptiles. We report here the complete and annotated genome sequence of an isolate from a monitor lizard (Varanus sp.), which contains a chromosome of 3,811,650 bp and no plasmids. Copyright © 2017 Reichley et al.

July 7, 2019

Draft genome sequence of Karnal bunt pathogen (Tilletia indica) of wheat provides insights into the pathogenic mechanisms of quarantined fungus.

Karnal bunt disease of wheat is caused by the hemibiotrophic fungus Tilletia indica, which has been designated a quarantine pest in more than 70 countries. Despite its economic importance, little is known about the molecular components of fungal pathogenesis. In this study, the genome sequence of T. indica was deciphered for the first time to unravel the functions of effectors in the molecular pathogenesis of Karnal bunt disease. The T. indica genome was sequenced using a hybrid approach combining the PacBio Single Molecule Real Time (SMRT) and Illumina HiSeq 2000 sequencing platforms. The genome was assembled into 10,957 contigs (N50 contig length 3 kb) with a total size of 26.7 Mb and a GC content of 53.99%. A total of 11,535 putative genes were predicted and annotated with Gene Ontology databases. Functional annotation of the Karnal bunt pathogen genome and classification of the identified effectors into protein families revealed interesting functions related to pathogenesis. A search for effector genes using the pathogen-host interaction database identified 135 genes. The T. indica genome sequence and the putative genes involved in molecular pathogenesis should further help in devising novel and effective disease management strategies, including the development of resistant wheat genotypes, novel biomarkers for pathogen detection, and new targets for fungicide development.

July 7, 2019

An improved genome assembly uncovers prolific tandem repeats in Atlantic cod.

The first Atlantic cod (Gadus morhua) genome assembly published in 2011 was one of the early genome assemblies exclusively based on high-throughput 454 pyrosequencing. Since then, rapid advances in sequencing technologies have led to a multitude of assemblies generated for complex genomes, although many of these are of a fragmented nature with a significant fraction of bases in gaps. The development of long-read sequencing and improved software now enable the generation of more contiguous genome assemblies. By combining data from Illumina, 454 and the longer PacBio sequencing technologies, as well as integrating the results of multiple assembly programs, we have created a substantially improved version of the Atlantic cod genome assembly. The sequence contiguity of this assembly is increased fifty-fold and the proportion of gap-bases has been reduced fifteen-fold. Compared to other vertebrates, the assembly contains an unusually high density of tandem repeats (TRs). Indeed, retrospective analyses reveal that gaps in the first genome assembly were largely associated with these TRs. We show that 21% of the TRs across the assembly, 19% in the promoter regions and 12% in the coding sequences are heterozygous in the sequenced individual. The inclusion of PacBio reads combined with the use of multiple assembly programs drastically improved the Atlantic cod genome assembly by successfully resolving long TRs. The high frequency of heterozygous TRs within or in the vicinity of genes in the genome indicates considerable standing genomic variation in Atlantic cod populations, which is likely of evolutionary importance.

July 7, 2019

Hybrid assembly of the large and highly repetitive genome of Aegilops tauschii, a progenitor of bread wheat, with the MaSuRCA mega-reads algorithm.

Long sequencing reads generated by single-molecule sequencing technology offer the possibility of dramatically improving the contiguity of genome assemblies. The biggest challenge today is that long reads have relatively high error rates, currently around 15%. The high error rates make it difficult to use this data alone, particularly with highly repetitive plant genomes. Errors in the raw data can lead to insertion or deletion errors (indels) in the consensus genome sequence, which in turn create significant problems for downstream analysis; for example, a single indel may shift the reading frame and incorrectly truncate a protein sequence. Here, we describe an algorithm that solves the high error rate problem by combining long, high-error reads with shorter but much more accurate Illumina sequencing reads, whose error rates average <1%. Our hybrid assembly algorithm combines these two types of reads to construct mega-reads, which are both long and accurate, and then assembles the mega-reads using the CABOG assembler, which was designed for long reads. We apply this technique to a large data set of Illumina and PacBio sequences from the species Aegilops tauschii, a large and extremely repetitive plant genome that has resisted previous attempts at assembly. We show that the resulting assembled contigs are far larger than in any previous assembly, with an N50 contig size of 486,807 nucleotides. We compare the contigs to independently produced optical maps to evaluate their large-scale accuracy, and to a set of high-quality bacterial artificial chromosome (BAC)-based assemblies to evaluate base-level accuracy. © 2017 Zimin et al.; Published by Cold Spring Harbor Laboratory Press.
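
The frameshift effect mentioned in this abstract can be illustrated with a minimal sketch. It is not part of the MaSuRCA pipeline: the helper names (translate, delete_base) and the 18-base toy sequence are hypothetical, chosen only to show how a single deleted base can introduce a premature stop codon. The codon table is the standard genetic code, written compactly in T, C, A, G order.

```python
# Toy illustration: one deleted base shifts the reading frame and
# truncates the translated protein. All names and the sequence are
# made up for this example.

BASES = "TCAG"
# Standard genetic code, codons enumerated in T, C, A, G order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def translate(dna):
    """Translate DNA from its first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        b1, b2, b3 = (BASES.index(base) for base in dna[i:i + 3])
        amino_acid = AMINO_ACIDS[16 * b1 + 4 * b2 + b3]
        if amino_acid == "*":  # stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_base(dna, position):
    """Return a copy of dna with the base at `position` removed (a 1-bp deletion)."""
    return dna[:position] + dna[position + 1:]


original = "ATGGCTGACTTGAAATAA"          # ATG GCT GAC TTG AAA TAA
mutated = delete_base(original, 3)        # ATG CTG ACT TGA AAT AA

print(translate(original))  # prints MADLK
print(translate(mutated))   # prints MLT
```

In this toy case the deletion turns a five-residue peptide into a three-residue one, which is the kind of truncation that consensus indels can cause in downstream gene prediction.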

July 7, 2019

Genome sequence of Plasmopara viticola and insight into the pathogenic mechanism.

Plasmopara viticola causes downy mildew disease of grapevine, which is one of the most devastating diseases of viticulture worldwide. Here we report a 101.3 Mb whole genome sequence of P. viticola isolate ‘JL-7-2’ obtained by a combination of Illumina and PacBio sequencing technologies. The P. viticola genome contains 17,014 putative protein-coding genes and has ~26% repetitive sequences. A total of 1,301 putative secreted proteins, including 100 putative RXLR effectors and 90 CRN effectors, were identified in this genome. In the secretome, 261 potential pathogenicity genes and 95 carbohydrate-active enzymes were predicted. Transcriptional analysis revealed that most of the RXLR effectors, pathogenicity genes and carbohydrate-active enzymes were significantly up-regulated during infection. Comparative genomic analysis revealed that P. viticola evolved independently from the Arabidopsis downy mildew pathogen Hyaloperonospora arabidopsidis. The availability of the P. viticola genome not only provides a valuable resource for comparative genomic analysis and evolutionary studies among oomycetes, but also enhances our knowledge of the mechanism of interactions between this biotrophic pathogen and its host.

July 7, 2019

An improved assembly of the loblolly pine mega-genome using long-read single-molecule sequencing.

The 22-gigabase genome of loblolly pine (Pinus taeda) is one of the largest ever sequenced. The draft assembly published in 2014 was built entirely from short Illumina reads, with lengths ranging from 100 to 250 base pairs (bp). The assembly was quite fragmented, containing over 11 million contigs whose weighted average (N50) size was 8206 bp. To improve this result, we generated approximately 12-fold coverage in long reads using the Single Molecule Real Time sequencing technology developed at Pacific Biosciences. We assembled the long and short reads together using the MaSuRCA mega-reads assembly algorithm, which produced a substantially better assembly, P. taeda version 2.0. The new assembly has an N50 contig size of 25,361, more than three times as large as achieved in the original assembly, and an N50 scaffold size of 107,821, 61% larger than the previous assembly. © The Author 2017. Published by Oxford University Press.
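
For readers unfamiliar with the N50 statistic quoted in these abstracts, the following is a minimal sketch under the common definition: the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the total assembly length. The function name n50 and the toy contig lengths are illustrative assumptions and are not taken from the loblolly pine data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the cumulative sum first reaches at least
    half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # reached half the assembly
            return length


# Toy assembly of six contigs, 30 bases in total.
toy_contigs = [2, 3, 4, 5, 6, 10]
print(n50(toy_contigs))  # prints 6, since 10 + 6 = 16 >= 15
```

Because the statistic is weighted by length, a few long contigs raise the N50 far more than many short ones, which is why adding long reads can improve it even when the contig count stays high.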

July 7, 2019

Characterization of Class IIa bacteriocin resistance in Enterococcus faecium.

Vancomycin-resistant enterococci, particularly resistant Enterococcus faecium, pose an escalating threat in nosocomial environments because of their innate resistance to many antibiotics, including vancomycin, a treatment of last resort. Many class IIa bacteriocins strongly target these enterococci and may offer a potential alternative for the management of this pathogen. However, E. faecium’s resistance to these peptides remains relatively uncharacterized. Here, we explored the development of resistance of E. faecium to a cocktail of three class IIa bacteriocins: enterocin A, enterocin P, and hiracin JM79. We started by quantifying the frequency of resistance to these peptides in four clinical isolates of E. faecium. We then investigated the levels of resistance of E. faecium 6E6 mutants as well as their fitness in different carbon sources. In order to elucidate the mechanism of resistance of E. faecium to class IIa bacteriocins, we completed whole-genome sequencing of resistant mutants and performed reverse transcription-quantitative PCR (qRT-PCR) of a suspected target mannose phosphotransferase (ManPTS). We then verified this ManPTS’s role in bacteriocin susceptibility by showing that expression of the ManPTS in Lactococcus lactis results in susceptibility to the peptide cocktail. Based on the evidence found from these studies, we conclude that, in accord with other studies in E. faecalis and Listeria monocytogenes, resistance to class IIa bacteriocins in E. faecium 6E6 is likely caused by the disruption of a particular ManPTS, which we believe we have identified. Copyright © 2017 American Society for Microbiology.

July 7, 2019

De novo genome and transcriptome assembly of the Canadian beaver (Castor canadensis).

The Canadian beaver (Castor canadensis) is the largest indigenous rodent in North America. We report a draft annotated assembly of the beaver genome, the first for a large rodent and the first mammalian genome assembled directly from uncorrected and moderate coverage (< 30 ×) long reads generated by single-molecule sequencing. The genome size is 2.7 Gb estimated by k-mer analysis. We assembled the beaver genome using the new Canu assembler optimized for noisy reads. The resulting assembly was refined using Pilon supported by short reads (80 ×) and checked for accuracy by congruency against an independent short read assembly. We scaffolded the assembly using the exon-gene models derived from 9805 full-length open reading frames (FL-ORFs) constructed from the beaver leukocyte and muscle transcriptomes. The final assembly comprised 22,515 contigs with an N50 of 278,680 bp and an N50-scaffold of 317,558 bp. Maximum contig and scaffold lengths were 3.3 and 4.2 Mb, respectively, with a combined scaffold length representing 92% of the estimated genome size. The completeness and accuracy of the scaffold assembly was demonstrated by the precise exon placement for 91.1% of the 9805 assembled FL-ORFs and 83.1% of the BUSCO (Benchmarking Universal Single-Copy Orthologs) gene set used to assess the quality of genome assemblies. Well-represented were genes involved in dentition and enamel deposition, defining characteristics of rodents with which the beaver is well-endowed. The study provides insights for genome assembly and an important genomics resource for Castoridae and rodent evolutionary biology. Copyright © 2017 Lok et al.

July 7, 2019

Genome sequence of the fungal strain 14919 producing 3-hydroxy-3-methylglutaryl–coenzyme A reductase inhibitor FR901512.

Fungal strain 14919 was originally isolated from a soil sample collected at Mt. Kiyosumi, Chiba Prefecture, Japan. It produces FR901512, a potent 3-hydroxy-3-methylglutaryl-coenzyme A (HMG-CoA) reductase inhibitor. The genome sequence of fungal strain 14919 was determined and annotated to improve the productivity of FR901512. Copyright © 2017 Itoh et al.

July 7, 2019

Genome sequences of Cyberlindnera fabianii 65, Pichia kudriavzevii 129, and Saccharomyces cerevisiae 131 isolated from fermented masau fruits in Zimbabwe.

Cyberlindnera fabianii 65, Pichia kudriavzevii 129, and Saccharomyces cerevisiae 131 have been isolated from the microbiota of fermented masau fruits. C. fabianii and P. kudriavzevii especially harbor promising features for biotechnology and food applications. Here, we present the draft annotated genome sequences of these isolates. Copyright © 2017 van Rijswijck et al.

July 7, 2019

Complete genome sequences of two Staphylococcus aureus sequence type 5 isolates from California, USA.

Staphylococcus aureus causes a variety of human diseases ranging in severity. The pathogenicity of S. aureus can be partially attributed to the acquisition of mobile genetic elements. In this report, we provide two complete genome sequences from human clinical S. aureus isolates. Copyright © 2017 Hau et al.

July 7, 2019

Complete genome sequence of Amycolatopsis orientalis CPCC200066, the producer of norvancomycin.

Amycolatopsis orientalis CPCC200066 is an actinomycete exploited commercially in China for the production of norvancomycin, an important glycopeptide antibiotic structurally close to the well-known vancomycin. The availability of the complete genome sequence of CPCC200066 would greatly strengthen our understanding of the regulation pattern of norvancomycin biosynthesis and ultimately improve its production, as well as potentiate discoveries of novel bioactive compounds. Here we report the complete genome sequence of A. orientalis CPCC200066, a circular chromosome consisting of 9,490,992 bp. Forty putative secondary metabolite biosynthetic gene clusters, including norvancomycin, were predicted, covering 20.3% of the whole genome. To facilitate genetic manipulation of this strain, an efficient transformation system was established by constructing a novel integrative vector pIMBT1, which could be transferred into CPCC200066 by electroporation with high efficiency. FBT1 attB sites were also identified in other known Amycolatopsis genomes, indicating pIMBT1’s prospect to be a novel vector for genus Amycolatopsis. Copyright © 2017 Elsevier B.V. All rights reserved.

July 7, 2019

High-quality de novo assembly of the apple genome and methylome dynamics of early fruit development.

Using the latest sequencing and optical mapping technologies, we have produced a high-quality de novo assembly of the apple (Malus domestica Borkh.) genome. Repeat sequences, which represented over half of the assembly, provided an unprecedented opportunity to investigate the uncharacterized regions of a tree genome; we identified a new hyper-repetitive retrotransposon sequence that was over-represented in heterochromatic regions and estimated that a major burst of different transposable elements (TEs) occurred 21 million years ago. Notably, the timing of this TE burst coincided with the uplift of the Tian Shan mountains, which is thought to be the center of the location where the apple originated, suggesting that TEs and associated processes may have contributed to the diversification of the apple ancestor and possibly to its divergence from pear. Finally, genome-wide DNA methylation data suggest that epigenetic marks may contribute to agronomically relevant aspects, such as apple fruit development.

July 7, 2019

Complete genome sequence of Staphylococcus epidermidis 1457.

Staphylococcus epidermidis 1457 is a frequently utilized strain that is amenable to genetic manipulation and has been widely used for biofilm-related research. We report here the whole-genome sequence of this strain, which encodes 2,277 protein-coding genes and 81 RNAs within its 2.4-Mb genome and plasmid. Copyright © 2017 Galac et al.

