NCBI Archives

September 21, 2019

Functional analysis of the first complete genome sequence of a multidrug resistant sequence type 2 Staphylococcus epidermidis.

Staphylococcus epidermidis is a significant opportunistic pathogen of humans. The ST2 lineage is frequently multidrug resistant and accounts for most of the clinical disease worldwide. However, there are no publicly available, closed ST2 genomes and pathogenesis studies have not focused on these strains. We report the complete genome and methylome of BPH0662, a multidrug resistant, hospital adapted, ST2 S. epidermidis, and describe the correlation between resistome and phenotype, as well as demonstrate its relationship to publicly available, international ST2 isolates. Furthermore, we delineate the methylome determined by the two type I restriction modification systems present in BPH0662 through heterologous expression in Escherichia coli, allowing the assignment of each system to its corresponding target recognition motif. As the first complete ST2 S. epidermidis genome, BPH0662 provides a valuable reference for future genomic studies of this clinically relevant lineage. Defining the methylome and the construction of these E. coli hosts provides the foundation for the development of molecular tools to bypass restriction modification systems in this lineage that has hitherto proven intractable.

September 21, 2019

Multiple genome sequences of important beer-spoiling lactic acid bacteria.

Seven strains of important beer-spoiling lactic acid bacteria were sequenced using single-molecule real-time sequencing. Complete genomes were obtained for strains of Lactobacillus paracollinoides, Lactobacillus lindneri, and Pediococcus claussenii. The analysis of these genomes emphasizes the role of plasmids as the genomic foundation of beer-spoiling ability. Copyright © 2016 Geissler et al.

September 21, 2019

Complete chloroplast genome sequence of the red silk cotton tree (Bombax ceiba)

Bombax ceiba L. is a beautiful deciduous tree with great ecological and economic importance. Third-generation sequencing of the B. ceiba chloroplast genome was conducted on the PacBio sequencing platform (Pacific Biosciences). The complete chloroplast genome is 158,997 bp and contains a large single-copy (LSC) region (89,021 bp), a small single-copy (SSC) region (21,110 bp), and two inverted repeats (IRs) (24,433 bp each). In total, 116 genes were annotated, including 81 protein-coding genes, eight rRNA genes, and 27 tRNA genes. The phylogenetic tree showed that B. ceiba was closely clustered with one clade of Malvaceae.
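The region lengths reported above can be cross-checked with a short calculation. The sketch below assumes the usual quadripartite plastome layout (one LSC, one SSC, and two copies of the IR); the function name is illustrative and not taken from the paper.

```python
# Region lengths in bp, as reported in the abstract above.
LSC_BP = 89_021
SSC_BP = 21_110
IR_BP = 24_433          # length of ONE inverted repeat copy
REPORTED_TOTAL_BP = 158_997


def quadripartite_length(lsc_bp: int, ssc_bp: int, ir_bp: int) -> int:
    """Total plastome length assuming LSC + SSC + two IR copies."""
    return lsc_bp + ssc_bp + 2 * ir_bp


total_bp = quadripartite_length(LSC_BP, SSC_BP, IR_BP)
print(f"computed total: {total_bp} bp")
print(f"matches reported total: {total_bp == REPORTED_TOTAL_BP}")
```

The sum agrees with the reported genome size only if the 24,433 bp figure is read as the length of each IR copy, not of both combined.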

September 21, 2019

Comparative genomics of enterohemorrhagic Escherichia coli O145:H28 demonstrates a common evolutionary lineage with Escherichia coli O157:H7.

Although serotype O157:H7 is the predominant enterohemorrhagic Escherichia coli (EHEC), outbreaks of non-O157 EHEC that cause severe foodborne illness, including hemolytic uremic syndrome, have increased worldwide. In fact, non-O157 serotypes are now estimated to cause over half of all the Shiga toxin-producing Escherichia coli (STEC) cases, and outbreaks of non-O157 EHEC infections are frequently associated with serotypes O26, O45, O103, O111, O121, and O145. Currently, there are no complete genomes for O145 in public databases. We determined the complete genome sequences of two O145 strains (EcO145), one linked to a US lettuce-associated outbreak (RM13514) and one to a Belgium ice-cream-associated outbreak (RM13516). Both strains contain one chromosome and two large plasmids, with genome sizes of 5,737,294 bp for RM13514 and 5,559,008 bp for RM13516. Comparative analysis of the two EcO145 genomes revealed a large core (5,173 genes) and a considerable amount of strain-specific genes. Additionally, the two EcO145 genomes display distinct chromosomal architecture, virulence gene profile, phylogenetic origin of Stx2a prophage, and methylation profile (methylome). Comparative analysis of EcO145 genomes to other completely sequenced STEC and other E. coli and Shigella genomes revealed that, unlike any other known non-O157 EHEC strain, EcO145 ascended from a common lineage with EcO157/EcO55. This evolutionary relationship was further supported by the pangenome analysis of the 10 EHEC strains. Of the 4,192 EHEC core genes, EcO145 shares more genes with EcO157 than with any other non-O157 EHEC strains. Our data provide evidence that EcO145 and EcO157 evolved from a common lineage, but ultimately each serotype evolves via a lineage-independent nature to EHEC by acquisition of the core set of EHEC virulence factors, including the genes encoding Shiga toxin and the large virulence plasmid.
The large variation between the two EcO145 genomes suggests a distinctive evolutionary path between the two outbreak strains. The distinct methylome between the two EcO145 strains is likely due to the presence of a BsuBI/PstI methyltransferase gene cassette in the Stx2a prophage of the strain RM13514, suggesting a role of horizontal gene transfer-mediated epigenetic alteration in the evolution of individual EHEC strains.

September 21, 2019

Whole genome sequence of the soybean aphid, Aphis glycines.

Aphids are emerging as model organisms for both basic and applied research. Of the 5,000 estimated species, only three aphids have published whole genome sequences: the pea aphid Acyrthosiphon pisum, the Russian wheat aphid, Diuraphis noxia, and the green peach aphid, Myzus persicae. We present the whole genome sequence of a fourth aphid, the soybean aphid (Aphis glycines), which is an extreme specialist and an important invasive pest of soybean (Glycine max). The availability of genomic resources is important to establish effective and sustainable pest control, as well as to expand our understanding of aphid evolution. We generated a 302.9 Mbp draft genome assembly for Ap. glycines using a hybrid sequencing approach. This assembly shows high completeness with 19,182 predicted genes, 92% of known Ap. glycines transcripts mapping to contigs, and substantial continuity with a scaffold N50 of 174,505 bp. The assembly represents 95.5% of the predicted genome size of 317.1 Mbp based on flow cytometry. Ap. glycines contains the smallest known aphid genome to date, based on updated genome sizes for 19 aphid species. The repetitive DNA content of the Ap. glycines genome assembly (81.6 Mbp or 26.94% of the 302.9 Mbp assembly) shows a reduction in the number of classified transposable elements compared to Ac. pisum, and likely contributes to the small estimated genome size. We include comparative analyses of gene families related to host-specificity (cytochrome P450’s and effectors), which may be important in Ap. glycines evolution. This Ap. glycines draft genome sequence will provide a resource for the study of aphid genome evolution, their interaction with host plants, and candidate genes for novel insect control methods. Copyright © 2017 Elsevier Ltd. All rights reserved.
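The completeness and repeat-content percentages quoted above follow directly from the reported sizes, and the scaffold N50 is a standard continuity statistic. The sketch below recomputes the two percentages from the abstract's figures and shows one common way to compute N50; the scaffold lengths in `toy_scaffolds` are hypothetical and only illustrate the definition.

```python
def n50(lengths):
    """Length L such that sequences of length >= L cover at least half the assembly."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one sequence length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Figures taken from the abstract above (Mbp).
ASSEMBLY_MBP = 302.9
PREDICTED_GENOME_MBP = 317.1   # flow cytometry estimate
REPEAT_MBP = 81.6

coverage_pct = 100 * ASSEMBLY_MBP / PREDICTED_GENOME_MBP
repeat_pct = 100 * REPEAT_MBP / ASSEMBLY_MBP
print(f"assembly / predicted genome size: {coverage_pct:.1f}%")
print(f"repeat content of assembly: {repeat_pct:.2f}%")

# Hypothetical scaffold lengths, used only to demonstrate the N50 definition.
toy_scaffolds = [50, 40, 30, 20, 10]
print(f"toy N50: {n50(toy_scaffolds)}")
```

Both recomputed percentages round to the values given in the abstract (95.5% and 26.94%).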

September 21, 2019

Retrotransposons are the major contributors to the expansion of the Drosophila ananassae Muller F element.

The discordance between genome size and the complexity of eukaryotes can partly be attributed to differences in repeat density. The Muller F element (~5.2 Mb) is the smallest chromosome in Drosophila melanogaster, but it is substantially larger (>18.7 Mb) in D. ananassae. To identify the major contributors to the expansion of the F element and to assess their impact, we improved the genome sequence and annotated the genes in a 1.4-Mb region of the D. ananassae F element, and a 1.7-Mb region from the D element for comparison. We find that transposons (particularly LTR and LINE retrotransposons) are major contributors to this expansion (78.6%), while Wolbachia sequences integrated into the D. ananassae genome are minor contributors (0.02%). Both D. melanogaster and D. ananassae F-element genes exhibit distinct characteristics compared to D-element genes (e.g., larger coding spans, larger introns, more coding exons, and lower codon bias), but these differences are exaggerated in D. ananassae. Compared to D. melanogaster, the codon bias observed in D. ananassae F-element genes can primarily be attributed to mutational biases instead of selection. The 5′ ends of F-element genes in both species are enriched in dimethylation of lysine 4 on histone 3 (H3K4me2), while the coding spans are enriched in H3K9me2. Despite differences in repeat density and gene characteristics, D. ananassae F-element genes show a similar range of expression levels compared to genes in euchromatic domains. This study improves our understanding of how transposons can affect genome size and how genes can function within highly repetitive domains. Copyright © 2017 Leung et al.

September 21, 2019

PacBio assembly of a Plasmodium knowlesi genome sequence with Hi-C correction and manual annotation of the SICAvar gene family.

Plasmodium knowlesi has risen in importance as a zoonotic parasite that has been causing regular episodes of malaria throughout South East Asia. The P. knowlesi genome sequence generated in 2008 highlighted and confirmed many similarities and differences in Plasmodium species, including a global view of several multigene families, such as the large SICAvar multigene family encoding the variant antigens known as the schizont-infected cell agglutination proteins. However, repetitive DNA sequences are the bane of any genome project, and this and other Plasmodium genome projects have not been immune to the gaps, rearrangements and other pitfalls created by these genomic features. Today, long-read PacBio and chromatin conformation technologies are overcoming such obstacles. Here, based on the use of these technologies, we present a highly refined de novo P. knowlesi genome sequence of the Pk1(A+) clone. This sequence and annotation, referred to as the ‘MaHPIC Pk genome sequence’, includes manual annotation of the SICAvar gene family with 136 full-length members categorized as type I or II. This sequence provides a framework that will permit a better understanding of the SICAvar repertoire, selective pressures acting on this gene family and mechanisms of antigenic variation in this species and other pathogens.

September 21, 2019

The kinetoplastid-infecting Bodo saltans virus (BsV), a window into the most abundant giant viruses in the sea.

Giant viruses are ecologically important players in aquatic ecosystems that have challenged concepts of what constitutes a virus. Herein, we present the giant Bodo saltans virus (BsV), the first characterized representative of the most abundant group of giant viruses in ocean metagenomes, and the first isolate of a klosneuvirus, a subgroup of the Mimiviridae proposed from metagenomic data. BsV infects an ecologically important microzooplankton, the kinetoplastid Bodo saltans. Its 1.39 Mb genome encodes 1227 predicted ORFs, including a complex replication machinery. Yet, much of its translational apparatus has been lost, including all tRNAs. Essential genes are invaded by homing endonuclease-encoding self-splicing introns that may defend against competing viruses. Putative anti-host factors show extensive gene duplication via a genomic accordion indicating an ongoing evolutionary arms race and highlighting the rapid evolution and genomic plasticity that has led to genome gigantism and the enigma that is giant viruses. © 2018, Deeg et al.

September 21, 2019

Chromulinavorax destructans, a pathogenic TM6 bacterium with an unusual replication strategy targeting protist mitochondrion

Most of the diversity of microbial life is not available in culture, and as such we lack even a fundamental understanding of the biological diversity of several branches on the tree of life. One branch that is highly underrepresented is the candidate phylum TM6, also known as the Dependentiae. Their biology is known only from reduced genomes recovered from metagenomes around the world and two isolates infecting amoebae, all of which suggest that they live highly host-associated lifestyles as parasites or symbionts. Chromulinavorax destructans is an isolate from the TM6/Dependentiae that infects and lyses the abundant heterotrophic flagellate, Spumella elongata. Chromulinavorax destructans is characterized by a high degree of reduction and specialization for infection, so much so that it was discovered in a screen for giant viruses. Its 1.2 Mb genome shows no metabolic potential, and C. destructans instead relies on an extensive transporter system to import nutrients, and even energy in the form of ATP, from the host. Accordingly, it replicates in a viral-like fashion, while extensively reorganizing and expanding the host mitochondrion. 44% of proteins contain signal sequences for secretion, including many proteins of unknown function as well as 98 copies of ankyrin-repeat domain proteins, known effectors of host modulation, suggesting the presence of an extensive host-manipulation apparatus.

September 21, 2019

Multi-Locus Variable number of tandem repeat Analysis (MLVA) of Yersinia ruckeri confirms the existence of host-specificity, geographic endemism and anthropogenic dissemination of virulent clones.

A Multi-Locus Variable number of tandem repeat Analysis (MLVA) assay was developed for epizootiological study of the internationally significant fish pathogen Yersinia ruckeri, which causes yersiniosis in salmonids. The assay involves amplification of ten Variable Number of Tandem Repeat (VNTR) loci in two five-plex PCR reactions, followed by capillary electrophoresis. A collection of 484 Y. ruckeri isolates, originating from various biological sources and collected from four continents over seven decades, was analysed. Minimum spanning tree cluster analysis of MLVA profiles separated the studied population into nine major clonal complexes, and a number of minor clusters and singletons. The major clonal complexes could be associated with host species, geographic origin and serotype. A single large clonal complex of serotype O1 isolates dominating the yersiniosis situation in international rainbow trout farming suggests anthropogenic spread of this clone, possibly related to transport of fish. Moreover, sub-clustering within this clonal complex indicates putative transmission routes and multiple biotype shift events. In contrast to the situation in rainbow trout, Y. ruckeri strains associated with disease in Atlantic salmon appear as more or less geographically isolated clonal complexes. A single complex of serotype O1 exclusive to Norway was found to be responsible for almost all major yersiniosis outbreaks in modern Norwegian salmon farming, and site-specific sub-clustering further indicates persistent colonisation of freshwater farms in Norway. Identification of genetically diverse Y. ruckeri isolates from clinically healthy fish and environmental sources also suggests the widespread existence of less virulent or avirulent strains. Importance: This comprehensive population study substantially improves our understanding of the epizootiological history and nature of an internationally important fish pathogenic bacterium.
The MLVA assay developed and presented represents a high-resolution typing tool particularly well suited for Yersinia ruckeri infection tracing, selection of strains for vaccine inclusion, and risk assessment. The ability of the assay to separate isolates into geographically linked and/or possibly host-specific clusters reflects its potential utility for maintenance of national biosecurity. The MLVA is internationally applicable, robust, and provides clear, unambiguous and easily interpreted results. Typing is reasonably inexpensive, with a moderate technological requirement, and may be completed from a harvested colony within a single working day. As the resulting MLVA profiles are readily portable, any Y. ruckeri strain may rapidly be placed in a global epizootiological context. Copyright © 2018 Gulla et al.

September 21, 2019

Assessing genome assembly quality using the LTR Assembly Index (LAI).

Assembling a plant genome is challenging due to the abundance of repetitive sequences, yet no standard is available to evaluate the assembly of repeat space. LTR retrotransposons (LTR-RTs) are the predominant interspersed repeat that is poorly assembled in draft genomes. Here, we propose a reference-free genome metric called LTR Assembly Index (LAI) that evaluates assembly continuity using LTR-RTs. After correcting for LTR-RT amplification dynamics, we show that LAI is independent of genome size, genomic LTR-RT content, and gene space evaluation metrics (i.e., BUSCO and CEGMA). By comparing genomic sequences produced by various sequencing techniques, we reveal the significant gain of assembly continuity by using long-read-based techniques over short-read-based methods. Moreover, LAI can facilitate iterative assembly improvement with assembler selection and identify low-quality genomic regions. To apply LAI, intact LTR-RTs and total LTR-RTs should contribute at least 0.1% and 5% to the genome size, respectively. The LAI program is freely available on GitHub: https://github.com/oushujun/LTR_retriever.
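The applicability thresholds in the last sentences above can be expressed as a simple pre-check. This is a minimal sketch, not part of the LTR_retriever package: the function name, its arguments, and the example genome figures are all hypothetical, and only the 0.1% and 5% cut-offs come from the abstract.

```python
MIN_INTACT_LTR_PCT = 0.1   # intact LTR-RTs, % of genome size (from the abstract)
MIN_TOTAL_LTR_PCT = 5.0    # total LTR-RTs, % of genome size (from the abstract)


def lai_is_applicable(genome_bp: int, intact_ltr_bp: int, total_ltr_bp: int) -> bool:
    """Return True if both LTR-RT fractions meet the stated minimums."""
    if genome_bp <= 0:
        raise ValueError("genome_bp must be positive")
    intact_pct = 100 * intact_ltr_bp / genome_bp
    total_pct = 100 * total_ltr_bp / genome_bp
    return intact_pct >= MIN_INTACT_LTR_PCT and total_pct >= MIN_TOTAL_LTR_PCT


# Hypothetical 1 Gbp genome: 0.2% intact and 10% total LTR-RT content.
rich_genome_ok = lai_is_applicable(1_000_000_000, 2_000_000, 100_000_000)
# Hypothetical 1 Gbp genome: only 0.05% intact LTR-RT content.
poor_genome_ok = lai_is_applicable(1_000_000_000, 500_000, 100_000_000)
print(rich_genome_ok, poor_genome_ok)
```

A check of this kind would only indicate whether LAI is meaningful for a genome; computing the index itself is left to the published program.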

September 21, 2019

Divergent selection causes whole genome differentiation without physical linkage among the targets in Spodoptera frugiperda (Noctuidae)

The process of speciation involves whole genome differentiation by overcoming gene flow between diverging populations. We have ample knowledge of which evolutionary forces may cause genomic differentiation, and several speciation models have been proposed to explain the transition from genetic to genomic differentiation. However, it is still unclear which critical conditions enable genomic differentiation in nature. The fall armyworm, Spodoptera frugiperda, is observed as two sympatric strains that have different host-plant ranges, suggesting the possibility of ecological divergent selection. In our previous study, we observed that these two strains show genetic differentiation across the whole genome with an unprecedentedly low extent, suggesting the possibility that whole genome sequences started to be differentiated between the strains. In this study, we analyzed whole genome sequences from these two strains from Mississippi to identify critical evolutionary factors for genomic differentiation. The genomic Fst is low (0.017) while 91.3% of 10 kb windows have Fst greater than 0, suggesting genome-wide differentiation with a low extent. We identified nearly 400 outliers of genetic differentiation between strains, and found that physical linkage among these outliers is not a primary cause of genomic differentiation. Fst is not significantly correlated with gene density, a proxy for the strength of selection, suggesting that a genomic reduction in migration rate dominates the extent of local genetic differentiation. Our analyses reveal that divergent selection alone is sufficient to generate genomic differentiation, and any following diversifying factors may increase the level of genetic differentiation between diverging strains in the process of speciation.

September 21, 2019

From the inside out: An epibiotic Bdellovibrio predator with an expanded genomic complement

Bdellovibrio and like organisms are abundant environmental predators of prokaryotes that show a diversity of predation strategies, ranging from intra-periplasmic to epibiotic predation. The novel epibiotic predator Bdellovibrio qaytius was isolated from a eutrophic freshwater pond in British Columbia, where it was a continual part of the microbial community. Bdellovibrio qaytius was found to preferentially prey on the beta-proteobacterium Paraburkholderia fungorum. Despite its epibiotic replication strategy, B. qaytius encodes a complex genomic complement more similar to periplasmic predators as well as several biosynthesis pathways not previously found in epibiotic predators. Bdellovibrio qaytius is representative of a widely distributed basal cluster within the genus Bdellovibrio, suggesting that epibiotic predation might be a common predation type in nature and ancestral to the genus.

September 21, 2019

Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.

We present a hierarchical genome-assembly process (HGAP) for high-quality de novo microbial genome assemblies using only a single, long-insert shotgun DNA library in conjunction with Single Molecule, Real-Time (SMRT) DNA sequencing. Our method uses the longest reads as seeds to recruit all other reads for construction of highly accurate preassembled reads through a directed acyclic graph-based consensus procedure, which we follow with assembly using off-the-shelf long-read assemblers. In contrast to hybrid approaches, HGAP does not require highly accurate raw reads for error correction. We demonstrate efficient genome assembly for several microorganisms using as few as three SMRT Cell zero-mode waveguide arrays of sequencing and for BACs using just one SMRT Cell. Long repeat regions can be successfully resolved with this workflow. We also describe a consensus algorithm that incorporates SMRT sequencing primary quality values to produce de novo genome sequence exceeding 99.999% accuracy.
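Consensus accuracy figures such as the 99.999% quoted above are often restated as Phred-scaled quality values. The sketch below applies the standard Phred definition, QV = -10 * log10(error rate); the function name is illustrative and the conversion is not described in the abstract itself.

```python
import math


def accuracy_to_qv(accuracy: float) -> float:
    """Convert a per-base accuracy (0 <= accuracy < 1) to a Phred-scaled QV."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    error_rate = 1.0 - accuracy
    return -10.0 * math.log10(error_rate)


# 99.999% accuracy corresponds to one expected error per 100,000 bases.
qv = accuracy_to_qv(0.99999)
print(f"QV for 99.999% accuracy: {qv:.1f}")
```

Under this definition, 99.999% accuracy corresponds to roughly QV 50.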

September 21, 2019

Discovery and genotyping of structural variation from long-read haploid genome sequence data.

In an effort to more fully understand the full spectrum of human genetic variation, we generated deep single-molecule, real-time (SMRT) sequencing data from two haploid human genomes. By using an assembly-based approach (SMRT-SV), we systematically assessed each genome independently for structural variants (SVs) and indels resolving the sequence structure of 461,553 genetic variants from 2 bp to 28 kbp in length. We find that >89% of these variants have been missed as part of analysis of the 1000 Genomes Project even after adjusting for more common variants (MAF > 1%). We estimate that this theoretical human diploid differs by as much as ~16 Mbp with respect to the human reference, with long-read sequencing data providing a fivefold increase in sensitivity for genetic variants ranging in size from 7 bp to 1 kbp compared with short-read sequence data. Although a large fraction of genetic variants were not detected by short-read approaches, once the alternate allele is sequence-resolved, we show that 61% of SVs can be genotyped in short-read sequence data sets with high accuracy. Uncoupling discovery from genotyping thus allows for the majority of this missed common variation to be genotyped in the human population. Interestingly, when we repeat SV detection on a pseudodiploid genome constructed in silico by merging the two haploids, we find that ~59% of the heterozygous SVs are no longer detected by SMRT-SV. These results indicate that haploid resolution of long-read sequencing data will significantly increase sensitivity of SV detection. © 2017 Huddleston et al.; Published by Cold Spring Harbor Laboratory Press.
