Genome assembly Archives - Page 132 of 196

July 7, 2019

Repeated divergent selection on pigmentation genes in a rapid finch radiation.

Instances of recent and rapid speciation are suitable for associating phenotypes with their causal genotypes, especially if gene flow homogenizes areas of the genome that are not under divergent selection. We study a rapid radiation of nine sympatric bird species known as capuchino seedeaters, which are differentiated in sexually selected characters of male plumage and song. We sequenced the genomes of a phenotypically diverse set of species to search for differentiated genomic regions. Capuchinos show differences in a small proportion of their genomes, yet selection has acted independently on the same targets in different members of this radiation. Many divergent regions contain genes involved in the melanogenesis pathway, with the strongest signal originating from putative regulatory regions. Selection has acted on these same genomic regions in different lineages, likely shaping the evolution of cis-regulatory elements, which control how more conserved genes are expressed and thereby generate diversity in classically sexually selected traits.

July 7, 2019

Whole-genome sequence of Acinetobacter pittii HUMV-6483 isolated from human urine.

Acinetobacter pittii strain HUMV-6483 was obtained from the urine of an adult patient. We report here its complete genome assembly using PacBio single-molecule real-time sequencing, which resulted in a chromosome of 4.07 Mb and a circular contig of 112 kb. About 3,953 protein-coding genes are predicted from this assembly. Copyright © 2017 Chapartegui-González et al.

July 7, 2019

Multiple genome sequences of Lactobacillus plantarum strains.

We report here the genome sequences of four Lactobacillus plantarum strains that vary in surface hydrophobicity. Bioinformatic analysis, using additional genomes of Lactobacillus plantarum strains, revealed a possible correlation between the cell wall teichoic acid type and cell surface hydrophobicity and provides the basis for subsequent analyses. Copyright © 2017 Kafka et al.

July 7, 2019

Genomics and comparative genomic analyses provide insight into the taxonomy and pathogenic potential of novel Emmonsia pathogens.

Over the last 50 years, newly described species of Emmonsia-like fungi have been implicated globally as sources of systemic human mycosis (emmonsiosis). Their ability to convert into yeast-like cells capable of replication and extra-pulmonary dissemination during the course of infection differentiates them from classical Emmonsia species. Immunocompromised patients are at highest risk of emmonsiosis and exhibit high mortality rates. To investigate the molecular basis for pathogenicity of the newly described Emmonsia species, we performed genomic sequencing and comparative genomic analyses of Emmonsia sp. 5z489, which was isolated from a non-deliberately immunosuppressed diabetic patient in China and represents a novel seventh isolate of Emmonsia-like fungi. The genome of 5z489 was 35.5 Mbp in length, ~5 Mbp larger than those of other Emmonsia strains. Further, 9,188 protein-coding genes were predicted in the 5z489 genome and 16% of the assembly was identified as repetitive elements, the highest proportion among Emmonsia species. Phylogenetic analyses based on whole genome data classified 5z489 and CAC-2015a, another novel isolate, as members of the genus Emmonsia. Our analyses showed that divergences among Emmonsia occurred much earlier than in other genera within the family Ajellomycetaceae, suggesting relatively distant evolutionary relationships within the genus. Through comparisons of Emmonsia species, we discovered significant pathogenicity characteristics within the genus as well as putative virulence factors that may play a role in the infection and pathogenicity of the novel Emmonsia strains. Moreover, our analyses revealed a novel distribution of DNA methylation patterns across the genome of 5z489, with >50% of methylated bases located in intergenic regions. These methylation patterns differ considerably from those of other reported fungi, where most methylation occurs in repetitive loci. It is unclear whether this difference is related to physiological adaptations of the new Emmonsia, but this question warrants further investigation. Overall, our analyses provide a framework from which to further study the evolutionary dynamics of Emmonsia strains and identify the underlying molecular mechanisms that determine the infectious and pathogenic potency of these fungal pathogens, and also provide insight into potential targets for therapeutic intervention in emmonsiosis and further research.

July 7, 2019

A high-coverage draft genome of the mycalesine butterfly Bicyclus anynana.

The mycalesine butterfly Bicyclus anynana, the ‘Squinting bush brown’, is a model organism in the study of lepidopteran ecology, development and evolution. Here, we present a draft genome sequence for B. anynana to serve as a genomics resource for current and future studies of this important model species. Seven libraries with insert sizes ranging from 350 bp to 20 kb were constructed using DNA from an inbred female and sequenced using both Illumina and PacBio technology. 128 Gb raw Illumina data were filtered to 124 Gb and assembled to a final size of 475 Mb (~260X assembly coverage). Contigs were scaffolded using mate-pair, transcriptome and PacBio data into 10,800 sequences with an N50 of 638 kb (longest scaffold 5 Mb). The genome is comprised of 26% repetitive elements, and encodes a total of 22,642 predicted protein-coding genes. Recovery of a BUSCO set of core metazoan genes was almost complete (98%). Overall, these metrics compare well with other recently published lepidopteran genomes. We report a high-quality draft genome sequence for Bicyclus anynana. The genome assembly and annotated gene models are available at LepBase (http://ensembl.lepbase.org/index.html).
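The coverage and N50 figures quoted in this abstract follow from simple definitions, sketched below. This is an illustration only, not the authors' pipeline: the function names and the toy scaffold lengths are assumptions made for the example, while 124 Gb and 475 Mb are the values given in the abstract.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


def coverage(total_read_bases, assembly_size):
    """Average depth: sequenced bases divided by assembly size."""
    return total_read_bases / assembly_size


# Hypothetical scaffold lengths, chosen only to make the arithmetic visible.
toy_scaffolds = [50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # 40: 50 + 40 = 90 is the first running sum >= 75

# Values from the abstract: 124 Gb of filtered reads, 475 Mb assembly.
print(round(coverage(124e9, 475e6)))  # 261, consistent with the quoted ~260X
```

A larger N50 means that half of the assembly sits in longer scaffolds, which is why it is commonly reported alongside the total assembly size.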

July 7, 2019

Genome graphs

There is increasing recognition that a single, monoploid reference genome is a poor universal reference structure for human genetics, because it represents only a tiny fraction of human variation. Adding this missing variation results in a structure that can be described as a mathematical graph: a genome graph. We demonstrate that, in comparison to the existing reference genome (GRCh38), genome graphs can substantially improve the fractions of reads that map uniquely and perfectly. Furthermore, we show that this fundamental simplification of read mapping transforms the variant calling problem from one in which many non-reference variants must be discovered de-novo to one in which the vast majority of variants are simply re-identified within the graph. Using standard benchmarks as well as a novel reference-free evaluation, we show that a simplistic variant calling procedure on a genome graph can already call variants at least as well as, and in many cases better than, a state-of-the-art method on the linear human reference genome. We anticipate that graph-based references will supplant linear references in humans and in other applications where cohorts of sequenced individuals are available.
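A toy example may help make the idea of a genome graph concrete. The sketch below is not the method or data structure used in the study: the `GenomeGraph` class, the node sequences and the exact-substring "mapping" are all assumptions for illustration. It encodes one single-nucleotide variant as two alternative nodes and shows a read that matches perfectly only along the alternate path.

```python
from collections import defaultdict


class GenomeGraph:
    """Minimal directed sequence graph: nodes carry sequence, edges give adjacency."""

    def __init__(self):
        self.seq = {}
        self.edges = defaultdict(list)

    def add_node(self, node_id, sequence):
        self.seq[node_id] = sequence

    def add_edge(self, src, dst):
        self.edges[src].append(dst)

    def paths(self, start, end):
        """Yield the sequence spelled by every path from start to end.
        Assumes the graph is acyclic."""
        stack = [(start, self.seq[start])]
        while stack:
            node, spelled = stack.pop()
            if node == end:
                yield spelled
                continue
            for nxt in self.edges[node]:
                stack.append((nxt, spelled + self.seq[nxt]))


# Hypothetical locus: shared flanks (nodes 1 and 4) around a T/C variant.
g = GenomeGraph()
g.add_node(1, "ACG")
g.add_node(2, "T")   # allele on the linear reference
g.add_node(3, "C")   # alternate allele added to the graph
g.add_node(4, "GGA")
for src, dst in [(1, 2), (1, 3), (2, 4), (3, 4)]:
    g.add_edge(src, dst)

haplotypes = sorted(g.paths(1, 4))
print(haplotypes)  # ['ACGCGGA', 'ACGTGGA']

# A read carrying the alternate allele matches exactly only on the graph path.
read = "GCGG"
hits = [h for h in haplotypes if read in h]
print(hits)  # ['ACGCGGA']
```

Against the linear reference path alone (ACGTGGA) the same read would need a mismatch, which is the sense in which known variation in the graph is re-identified rather than discovered de novo.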

July 7, 2019

Proteogenomics produces comprehensive and highly accurate protein-coding gene annotation in a complete genome assembly of Malassezia sympodialis.

Complete and accurate genome assembly and annotation is a crucial foundation for comparative and functional genomics. Despite this, few complete eukaryotic genomes are available, and genome annotation remains a major challenge. Here, we present a complete genome assembly of the skin commensal yeast Malassezia sympodialis and demonstrate how proteogenomics can substantially improve gene annotation. Through long-read DNA sequencing, we obtained a gap-free genome assembly for M. sympodialis (ATCC 42132), comprising eight nuclear chromosomes and one mitochondrial chromosome. We also sequenced and assembled four M. sympodialis clinical isolates, and showed their value for understanding Malassezia reproduction by confirming four alternative allele combinations at the two mating-type loci. Importantly, we demonstrated how proteomics data could be readily integrated with transcriptomics data in standard annotation tools. This increased the number of annotated protein-coding genes by 14% (from 3612 to 4113), compared to using transcriptomics evidence alone. Manual curation further increased the number of protein-coding genes by 9% (to 4493). All of these genes have RNA-seq evidence and 87% were confirmed by proteomics. The M. sympodialis genome assembly and annotation presented here is of a quality so far achieved for only a few eukaryotic organisms, and constitutes an important reference for future host-microbe interaction studies. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
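The percentage gains quoted above can be checked directly from the gene counts in the abstract. The helper name and variable names below are assumptions for illustration; only the three counts come from the text.

```python
def percent_increase(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Gene counts as reported in the abstract.
transcriptomics_only = 3612
with_proteomics = 4113
after_manual_curation = 4493

print(f"{percent_increase(transcriptomics_only, with_proteomics):.1f}%")   # 13.9%
print(f"{percent_increase(with_proteomics, after_manual_curation):.1f}%")  # 9.2%
```

Each percentage is relative to the preceding step, so the 9% gain from manual curation is measured against the 4113 genes obtained after adding proteomics evidence, not against the original 3612.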

July 7, 2019

Whole genome sequencing and analysis of Campylobacter coli YH502 from retail chicken reveals a plasmid-borne type VI secretion system.

Campylobacter is a major cause of foodborne illnesses worldwide. Campylobacter infections, commonly caused by ingestion of undercooked poultry and meat products, can lead to gastroenteritis and chronic reactive arthritis in humans. Whole genome sequencing (WGS) is a powerful technology that provides comprehensive genetic information about bacteria and is increasingly being applied to study foodborne pathogens, e.g., their evolution, epidemiology/outbreak investigation, and detection. Herein we report the complete genome sequence of Campylobacter coli strain YH502 isolated from retail chicken in the United States. WGS, de novo assembly, and annotation of the genome revealed a chromosome of 1,718,974 bp and a mega-plasmid (pCOS502) of 125,964 bp. GC content of the genome was 31.2% with 1931 coding sequences and 53 non-coding RNAs. Multiple virulence factors including a plasmid-borne type VI secretion system (T6SS) and antimicrobial resistance genes (beta-lactams, fluoroquinolones, and aminoglycosides) were found. The presence of T6SS in a mobile genetic element (plasmid) suggests plausible horizontal transfer of these virulence genes to other organisms. The C. coli YH502 genome also harbors CRISPR sequences and associated proteins. Phylogenetic analysis based on average nucleotide identity and single nucleotide polymorphisms identified closely related C. coli genomes available in the NCBI database. Taken together, the analyzed genomic data of this potentially virulent strain of C. coli will facilitate further understanding of this important foodborne pathogen, most likely leading to better control strategies. The chromosome and plasmid sequences of C. coli YH502 have been deposited in GenBank under the accession numbers CP018900.1 and CP018901.1, respectively.

July 7, 2019

The origin, diversification and adaptation of a major mangrove clade (Rhizophoreae) revealed by whole-genome sequencing

Mangroves invade some very marginal habitats for woody plants, at the interface between land and sea. Since mangroves anchor tropical coastal communities globally, their origin, diversification and adaptation are of scientific significance, particularly at a time of global climate change. In this study, a combination of single-molecule long reads and the more conventional short reads is generated from Rhizophora apiculata for the de novo assembly of its genome to a near chromosome level. The longest scaffold, N50 and N90 for the R. apiculata genome are 13.3 Mb, 5.4 Mb and 1.0 Mb, respectively. Short reads for the genomes and transcriptomes of eight related species are also generated. We find that the ancestor of Rhizophoreae experienced a whole-genome duplication ~70 Myrs ago, which is followed rather quickly by colonization and species diversification. Mangroves exhibit pan-exome modifications of amino acid (AA) usage as well as unusual AA substitutions among closely related species. The usage and substitution of AAs, unique among plants surveyed, is correlated with the rapid evolution of proteins in mangroves. A small subset of these substitutions is associated with mangroves’ highly specialized traits (vivipary and red bark) thought to be adaptive in the intertidal habitats. Despite the many adaptive features, mangroves are among the least genetically diverse plants, likely the result of continual habitat turnovers caused by repeated rises and falls of sea level in the geologically recent past. Mangrove genomes thus shed light on their past evolutionary success and portend a possibly difficult future.

July 7, 2019

Complete genome of a metabolically-diverse marine bacterium Shewanella japonica KCTC 22435T.

Shewanella japonica KCTC 22435T is a facultatively anaerobic, Gram-negative, mesophilic, rod-shaped bacterium isolated from sea water at the Pacific Institute of Bio-organic Chemistry of the Marine Experimental Station, Troitza Bay, Gulf of Peter the Great, Russia. Here, we report the complete genome of S. japonica KCTC 22435T, which consists of 4,975,677 bp (G+C content of 40.80%) with a single chromosome, 4036 protein-coding genes, 97 tRNAs and 8 rRNA operons. Genes detected in the genome reveal that the strain possesses a type II secretion system, cytochrome c family proteins with various numbers of heme-binding motifs, and metabolic pathways for utilizing diverse carbon sources, supporting the potential of KCTC 22435T to generate electricity in salinity culture conditions. Copyright © 2017 Elsevier B.V. All rights reserved.

July 7, 2019

Comparative genomics of all three Campylobacter sputorum biovars and a novel cattle-associated C. sputorum clade.

Campylobacter sputorum is a non-thermotolerant campylobacter that is primarily isolated from food animals such as cattle and sheep. C. sputorum is also infrequently associated with human illness. Based on catalase and urease activity, three biovars are currently recognized within C. sputorum: bv. sputorum (catalase negative, urease negative), bv. fecalis (catalase positive, urease negative), and bv. paraureolyticus (catalase negative, urease positive). A multi-locus sequence typing (MLST) method was recently constructed for C. sputorum. MLST typing of several cattle-associated C. sputorum isolates suggested that they are members of a divergent C. sputorum clade. Although catalase positive, and thus technically bv. fecalis, the taxonomic position of these strains could not be determined solely by MLST. To further characterize C. sputorum, the genomes of four strains, representing all three biovars and the divergent clade, were sequenced to completion. Here we present a comparative genomic analysis of the four C. sputorum genomes. This analysis indicates that the three biovars and the cattle-associated strains are highly-related at the genome level with similarities in gene content. Furthermore, the four genomes are strongly syntenic with one or two minor inversions. However, substantial differences in gene content were observed among the three biovars. Finally, although the strain representing the cattle-associated isolates was shown to be C. sputorum, it is possible that this strain is a member of a novel C. sputorum subspecies; thus, these cattle-associated strains may form a second taxon within C. sputorum. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution 2017. This work is written by US Government employees and is in the public domain in the US.

July 7, 2019

Complete genome sequence of Planococcus donghaensis JH1(T), a pectin-degrading bacterium.

The type strain Planococcus donghaensis JH1(T) is a psychrotolerant and halotolerant bacterium with starch-degrading ability. Here, we determine the carbon utilization profile of P. donghaensis JH1(T) and report the first complete genome of the strain. This study revealed the strain’s ability to utilize pectin and d-galacturonic acid, and identified genes responsible for degradation of the polysaccharides. The genomic information provided may serve as a fundamental resource for full exploration of the biotechnological potential of P. donghaensis JH1(T). Copyright © 2017. Published by Elsevier B.V.

July 7, 2019

Complete genome sequence of the drought resistance-promoting endophyte Klebsiella sp. LTGPAF-6F.

Bacterial endophytes with the capacity to promote plant growth and improve plant tolerance against biotic and abiotic stresses are important in agricultural practice and phytoremediation. A plant growth-promoting endophyte named Klebsiella sp. LTGPAF-6F, which was isolated from the roots of the desert plant Alhagi sparsifolia in north-west China, exhibits the ability to enhance the growth of wheat under drought stress. The complete genome sequence of this strain consists of one circular chromosome and two circular plasmids. From the genome, we identified genes related to plant growth promotion and stress tolerance, such as nitrogen fixation and production of indole-3-acetic acid, acetoin, 2,3-butanediol, spermidine and trehalose. This genome sequence provides a basis for understanding the beneficial interactions between LTGPAF-6F and host plants, and will facilitate its application as a biotechnological agent in agriculture. Copyright © 2017 Elsevier B.V. All rights reserved.

July 7, 2019

Complete genome sequence of a natural compounds producer, Streptomyces violaceus S21.

The complete genome sequence of Streptomyces violaceus strain S21, a valuable natural compounds producer isolated from forest soil, is presented here for the first time. The genome comprises 7.91 Mbp, with a G + C content of 72.65%. A range of genes involved in pathways of secondary product biosynthesis were predicted. The genome sequence is available at DDBJ/EMBL/GenBank under the accession number CP020570. The annotation comprises 6856 predicted genes and identifies the natural product biosynthetic gene clusters in S. violaceus.

July 7, 2019

Whole genome characterization of a naturally occurring vancomycin-dependent Enterococcus faecium from a patient with bacteremia.

Vancomycin-dependent enterococci are a relatively uncommon phenotype recovered in the clinical laboratory. Recognition and recovery of these isolates are important for providing accurate identification and susceptibility information to treating physicians. Herein, we describe the recovery of vancomycin-dependent and revertant E. faecium isolates harboring the vanB operon from a patient with bacteremia. Using whole genome sequencing, we found a unique single nucleotide polymorphism (S186N) in the D-Ala-D-Ala ligase (ddl) conferring vancomycin dependency. Additionally, we found that a majority of in vitro revertants mutated outside ddl, with some strains harboring mutations in vanS, while others likely contain novel mechanisms of reversion. Copyright © 2017 Elsevier B.V. All rights reserved.
