July 7, 2019

Hybrid assembly with long and short reads improves discovery of gene family expansions.

Long-read and short-read sequencing technologies offer competing advantages for eukaryotic genome sequencing projects. Combinations of both may be appropriate for surveys of within-species genomic variation. We developed a hybrid assembly pipeline called “Alpaca” that can operate on 20X long-read coverage plus about 50X short-insert and 50X long-insert short-read coverage. To preclude collapse of tandem repeats, Alpaca relies on base-call-corrected long reads for contig formation. Compared to two other assembly protocols, Alpaca demonstrated the most reference agreement and repeat capture on the rice genome. On three accessions of the model legume Medicago truncatula, Alpaca generated the most agreement to a conspecific reference and predicted tandemly repeated genes absent from the other assemblies. Our results suggest Alpaca is a useful tool for investigating structural and copy number variation within de novo assemblies of sampled populations.


July 7, 2019

Whole genome sequencing and analysis of Campylobacter coli YH502 from retail chicken reveals a plasmid-borne type VI secretion system.

Campylobacter is a major cause of foodborne illnesses worldwide. Campylobacter infections, commonly caused by ingestion of undercooked poultry and meat products, can lead to gastroenteritis and chronic reactive arthritis in humans. Whole genome sequencing (WGS) is a powerful technology that provides comprehensive genetic information about bacteria and is increasingly being applied to study foodborne pathogens: e.g., evolution, epidemiology/outbreak investigation, and detection. Herein we report the complete genome sequence of Campylobacter coli strain YH502 isolated from retail chicken in the United States. WGS, de novo assembly, and annotation of the genome revealed a chromosome of 1,718,974 bp and a mega-plasmid (pCOS502) of 125,964 bp. GC content of the genome was 31.2% with 1931 coding sequences and 53 non-coding RNAs. Multiple virulence factors including a plasmid-borne type VI secretion system and antimicrobial resistance genes (beta-lactams, fluoroquinolones, and aminoglycoside) were found. The presence of T6SS in a mobile genetic element (plasmid) suggests plausible horizontal transfer of these virulence genes to other organisms. The C. coli YH502 genome also harbors CRISPR sequences and associated proteins. Phylogenetic analysis based on average nucleotide identity and single nucleotide polymorphisms identified closely related C. coli genomes available in the NCBI database. Taken together, the analyzed genomic data of this potentially virulent strain of C. coli will facilitate further understanding of this important foodborne pathogen most likely leading to better control strategies. The chromosome and plasmid sequences of C. coli YH502 have been deposited in GenBank under the accession numbers CP018900.1 and CP018901.1, respectively.


July 7, 2019

Complete genome of a metabolically-diverse marine bacterium Shewanella japonica KCTC 22435T.

Shewanella japonica KCTC 22435T is a facultatively anaerobic, Gram-negative, mesophilic, rod-shaped bacterium isolated from sea water at the Pacific Institute of Bio-organic Chemistry of the Marine Experimental Station, Troitza Bay, Gulf of Peter the Great, Russia. Here, we report the complete genome of S. japonica KCTC 22435T, which consists of 4,975,677 bp (G+C content of 40.80%) with a single chromosome, 4036 protein-coding genes, 97 tRNAs and 8 rRNA operons. Genes detected in the genome reveal that the strain possesses a type II secretion system, cytochrome c family proteins with various numbers of heme-binding motifs, and metabolic pathways for utilizing diverse carbon sources, supporting the potential of KCTC 22435T to generate electricity in salinity culture conditions. Copyright © 2017 Elsevier B.V. All rights reserved.


July 7, 2019

Comparative genomics of all three Campylobacter sputorum biovars and a novel cattle-associated C. sputorum clade.

Campylobacter sputorum is a non-thermotolerant campylobacter that is primarily isolated from food animals such as cattle and sheep. C. sputorum is also infrequently associated with human illness. Based on catalase and urease activity, three biovars are currently recognized within C. sputorum: bv. sputorum (catalase negative, urease negative), bv. fecalis (catalase positive, urease negative), and bv. paraureolyticus (catalase negative, urease positive). A multi-locus sequence typing (MLST) method was recently constructed for C. sputorum. MLST typing of several cattle-associated C. sputorum isolates suggested that they are members of a divergent C. sputorum clade. Although catalase positive, and thus technically bv. fecalis, the taxonomic position of these strains could not be determined solely by MLST. To further characterize C. sputorum, the genomes of four strains, representing all three biovars and the divergent clade, were sequenced to completion. Here we present a comparative genomic analysis of the four C. sputorum genomes. This analysis indicates that the three biovars and the cattle-associated strains are highly-related at the genome level with similarities in gene content. Furthermore, the four genomes are strongly syntenic with one or two minor inversions. However, substantial differences in gene content were observed among the three biovars. Finally, although the strain representing the cattle-associated isolates was shown to be C. sputorum, it is possible that this strain is a member of a novel C. sputorum subspecies; thus, these cattle-associated strains may form a second taxon within C. sputorum. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution 2017. This work is written by US Government employees and is in the public domain in the US.


July 7, 2019

Whole genome sequencing predicts novel human disease models in rhesus macaques.

Rhesus macaques are an important pre-clinical model of human disease. To advance our understanding of genomic variation that may influence disease, we surveyed genome-wide variation in 21 rhesus macaques. We employed best-practice variant calling, validated with Mendelian inheritance. Next, we used alignment data from our cohort to detect genomic regions likely to produce inaccurate genotypes, potentially due to either gene duplication or structural variation between individuals. We generated a final dataset of >16 million high confidence variants, including 13 million in Chinese-origin rhesus macaques, an increasingly important disease model. We detected an average of 131 mutations predicted to severely alter protein coding per animal, and identified 45 such variants that coincide with known pathogenic human variants. These data suggest that expanded screening of existing breeding colonies will identify novel models of human disease, and that increased genomic characterization can help inform research studies in macaques. Copyright © 2017 Elsevier Inc. All rights reserved.


July 7, 2019

Analysis of the genome and mobilome of a dissimilatory arsenate reducing Aeromonas sp. O23A reveals multiple mechanisms for heavy metal resistance and metabolism.

Aeromonas spp. are among the most ubiquitous microorganisms, as they have been isolated from different environmental niches including waters, soil, as well as wounds and digestive tracts of poikilothermic animals and humans. Although much attention has been paid to the pathogenicity of Aeromonads, the role of these bacteria in environmentally important processes, such as transformation of heavy metals, remains to be discovered. Therefore, the aim of this study was a detailed genomic characterization of Aeromonas sp. O23A, the first representative of this genus capable of dissimilatory arsenate reduction. The strain was isolated from microbial mats from the Zloty Stok mine (SW Poland), an environment strongly contaminated with arsenic. Previous physiological studies indicated that O23A may be involved in both mobilization and immobilization of this metalloid in the environment. To discover the molecular basis of the mechanisms behind the observed abilities, the genome of O23A (~5.0 Mbp) was sequenced and annotated, and genes for arsenic respiration, heavy metal resistance (hmr) and other phenotypic traits, including siderophore production, were identified. The functionality of the indicated gene modules was assessed in a series of minimal inhibitory concentration analyses for various metals and metalloids, as well as mineral dissolution experiments. Interestingly, comparative analyses revealed that O23A is related to a fish pathogen Aeromonas salmonicida subsp. salmonicida A449 which, however, does not carry genes for arsenic respiration. This indicates that the dissimilatory arsenate reduction ability may have been lost during genome reduction in pathogenic strains, or acquired through horizontal gene transfer. Therefore, particular emphasis was placed upon the mobilome of O23A, consisting of four plasmids, a phage, and numerous transposable elements, which may play a role in the dissemination of hmr and arsenic metabolism genes in the environment. 
The obtained results indicate that Aeromonas sp. O23A is well-adapted to the extreme environmental conditions occurring in the Zloty Stok mine. The analysis of genome encoded traits allowed for a better understanding of the mechanisms of adaptation of the strain, also with respect to its presumable role in colonization and remediation of arsenic-contaminated waters, which may never have been discovered based on physiological analyses alone.


July 7, 2019

Phylogenomic analysis supports multiple instances of polyphyly in the oomycete peronosporalean lineage.

The study of biological diversification of oomycetes has been a difficult task for more than a century. Pioneer researchers used morphological characters to describe this heterogeneous group, and physiological and genetic tools expanded knowledge of these microorganisms. However, research on oomycete diversification is limited by conflicting phylogenies. Using whole genomic data from 17 oomycete taxa, we obtained a dataset of 277 core orthologous genes shared among these genomes. Analyses of this dataset resulted in highly congruent and strongly supported estimates of oomycete phylogeny when we used concatenated maximum likelihood and coalescent-based methods; the one important exception was the position of Albugo. Our results supported the position of Phytopythium vexans (formerly in Pythium clade K) as a sister clade to the Phytophthora-Hyaloperonospora clade. The remaining clades comprising Pythium sensu lato formed two monophyletic groups. One group was composed of three taxa that correspond to Pythium clades A, B and C, and the other group contained taxa representing clades F, G and I, in agreement with previous Pythium phylogenies. However, the group containing Pythium clades F, G and I was placed as sister to the Phytophthora-Hyaloperonospora-Phytopythium clade, thus confirming the lack of monophyly of Pythium sensu lato. Multispecies coalescent methods revealed that the white blister rust, Albugo laibachii, could not be placed with a high degree of confidence. Our analyses show that genomic data can resolve the oomycete phylogeny and provide a phylogenetic framework to study the evolution of oomycete lifestyles. Copyright © 2017 Elsevier Inc. All rights reserved.


July 7, 2019

Chromosome-level genome assembly and transcriptome of the green alga Chromochloris zofingiensis illuminates astaxanthin production.

Microalgae have potential to help meet energy and food demands without exacerbating environmental problems. There is interest in the unicellular green alga Chromochloris zofingiensis, because it produces lipids for biofuels and a highly valuable carotenoid nutraceutical, astaxanthin. To advance understanding of its biology and facilitate commercial development, we present a C. zofingiensis chromosome-level nuclear genome, organelle genomes, and transcriptome from diverse growth conditions. The assembly, derived from a combination of short- and long-read sequencing in conjunction with optical mapping, revealed a compact genome of ~58 Mbp distributed over 19 chromosomes containing 15,274 predicted protein-coding genes. The genome has uniform gene density over chromosomes, low repetitive sequence content (~6%), and a high fraction of protein-coding sequence (~39%) with relatively long coding exons and few coding introns. Functional annotation of gene models identified orthologous families for the majority (~73%) of genes. Synteny analysis uncovered localized but scrambled blocks of genes in putative orthologous relationships with other green algae. Two genes encoding beta-ketolase (BKT), the key enzyme synthesizing astaxanthin, were found in the genome, and both were up-regulated by high light. Isolation and molecular analysis of astaxanthin-deficient mutants showed that BKT1 is required for the production of astaxanthin. Moreover, the transcriptome under high light exposure revealed candidate genes that could be involved in critical yet missing steps of astaxanthin biosynthesis, including ABC transporters, cytochrome P450 enzymes, and an acyltransferase. The high-quality genome and transcriptome provide insight into the green algal lineage and carotenoid production.


July 7, 2019

Sequencing a piece of history: complete genome sequence of the original Escherichia coli strain.

In 1885, Theodor Escherich first described the Bacillus coli commune, which was subsequently renamed Escherichia coli. We report the complete genome sequence of this original strain (NCTC 86). The 5,144,392 bp circular chromosome encodes the genes for 4805 proteins, which include antigens, virulence factors, antimicrobial-resistance factors and secretion systems, of a commensal organism from the pre-antibiotic era. It is located in the E. coli A subgroup and is closely related to E. coli K-12 MG1655. E. coli strain NCTC 86 and the non-pathogenic K-12, C, B and HS strains share a common backbone that is largely co-linear. The exception is a large 2,803,932 bp inversion that spans the replication terminus from gmhB to clpB. Comparison with E. coli K-12 reveals 41 regions of difference (577,351 bp) distributed across the chromosome. For example, and contrary to current dogma, E. coli NCTC 86 includes a nine-gene sil locus that encodes a silver-resistance efflux pump acquired before the current widespread use of silver nanoparticles as an antibacterial agent, possibly resulting from the widespread use of silver utensils and currency in Germany in the 1800s. In summary, phylogenetic comparisons with other E. coli strains confirmed that the original strain isolated by Escherich is most closely related to the non-pathogenic commensal strains. It is more distant from the root than the pathogenic organisms E. coli 042 and O157:H7; therefore, it is not an ancestral state for the species.


July 7, 2019

Antibody-independent mechanisms regulate the establishment of chronic Plasmodium infection.

Malaria is caused by parasites of the genus Plasmodium. All human-infecting Plasmodium species can establish long-lasting chronic infections(1-5), creating an infectious reservoir to sustain transmission(1,6). It is widely accepted that the maintenance of chronic infection involves evasion of adaptive immunity by antigenic variation(7). However, genes involved in this process have been identified in only two of five human-infecting species: Plasmodium falciparum and Plasmodium knowlesi. Furthermore, little is understood about the early events in the establishment of chronic infection in these species. Using a rodent model we demonstrate that from the infecting population, only a minority of parasites, expressing one of several clusters of virulence-associated pir genes, establishes a chronic infection. This process occurs in different species of parasites and in different hosts. Establishment of chronicity is independent of adaptive immunity and therefore different from the mechanism proposed for maintenance of chronic P. falciparum infections(7-9). Furthermore, we show that the proportions of parasites expressing different types of pir genes regulate the time taken to establish a chronic infection. Because pir genes are common to most, if not all, species of Plasmodium(10), this process may be a common way of regulating the establishment of chronic infections.


July 7, 2019

Adaptation of genetically monomorphic bacteria: evolution of copper resistance through multiple horizontal gene transfers of complex and versatile mobile genetic elements.

Copper-based antimicrobial compounds are widely used to control plant bacterial pathogens. Pathogens have adapted in response to this selective pressure. Xanthomonas citri pv. citri, a major citrus pathogen causing Asiatic citrus canker, was first reported to carry plasmid-encoded copper resistance in Argentina. This phenotype was conferred by the copLAB gene system. The emergence of resistant strains has since been reported in Réunion and Martinique. Using microsatellite-based genotyping and copLAB PCR, we demonstrated that the genetic structure of the copper-resistant strains from these three regions was made up of two distant clusters and varied for the detection of copLAB amplicons. In order to investigate this pattern more closely, we sequenced six copper-resistant X. citri pv. citri strains from Argentina, Martinique and Réunion, together with reference copper-resistant Xanthomonas and Stenotrophomonas strains using long-read sequencing technology. Genes involved in copper resistance were found to be strain dependent with the novel identification in X. citri pv. citri of copABCD and a cus heavy metal efflux resistance-nodulation-division system. The genes providing the adaptive trait were part of a mobile genetic element similar to Tn3-like transposons and included in a conjugative plasmid. This indicates the system’s great versatility. The mining of all available bacterial genomes suggested that, within the bacterial community, the spread of copper resistance associated with mobile elements and their plasmid environments was primarily restricted to the Xanthomonadaceae family. © 2017 John Wiley & Sons Ltd.


July 7, 2019

Euglena gracilis genome and transcriptome: organelles, nuclear genome assembly strategies and initial features.

Euglena gracilis is a major component of the aquatic ecosystem and, together with closely related species, is ubiquitous worldwide. Euglenoids are an important group of protists, possessing a secondarily acquired plastid, and are relatives of the Kinetoplastidae, which themselves have global impact as disease agents. To understand the biology of E. gracilis, as well as to provide further insight into the evolution and origins of the Kinetoplastidae, we embarked on sequencing the nuclear genome; the plastid and mitochondrial genomes are already in the public domain. Earlier studies suggested an extensive nuclear DNA content, with likely a high degree of repetitive sequence, together with significant extrachromosomal elements. To produce a list of coding sequences we have combined transcriptome data from both published and new sources, as well as embarked on de novo sequencing using a combination of 454, Illumina paired-end libraries and long PacBio reads. Preliminary analysis suggests a surprisingly large genome approaching 2 Gbp, with a highly fragmented architecture and extensive repeat composition. Over 80% of the RNAseq reads from E. gracilis map to the assembled genome sequence, which is comparable with the well-assembled genomes of T. brucei and T. cruzi. In order to achieve this level of assembly we employed multiple informatics pipelines, which are discussed here. Finally, as a preliminary view of the genome architecture, we discuss the tubulin and calmodulin genes, which highlight potential novel splicing mechanisms.


July 7, 2019

Strategies for optimizing BioNano and Dovetail explored through a second reference quality assembly for the legume model, Medicago truncatula.

Third generation sequencing technologies, with sequencing reads in the tens of kilobases, facilitate genome assembly by spanning ambiguous regions and improving continuity. This has been critical for plant genomes, which are difficult to assemble due to high repeat content, gene family expansions, segmental and tandem duplications, and polyploidy. Recently, high-throughput mapping and scaffolding strategies have further improved continuity. Together, these long-range technologies enable quality draft assemblies of complex genomes in a cost-effective and timely manner. Here, we present high quality genome assemblies of the model legume plant, Medicago truncatula (R108) using PacBio, Dovetail Chicago (hereafter, Dovetail) and BioNano technologies. To test these technologies for plant genome assembly, we generated five assemblies using all possible combinations and ordering of these three technologies in the R108 assembly. While the BioNano and Dovetail joins overlapped, they also showed complementary gains in continuity and join numbers. Both technologies spanned repetitive regions that PacBio alone was unable to bridge. Combining technologies, particularly Dovetail followed by BioNano, resulted in notable improvements compared to Dovetail or BioNano alone. A combination of PacBio, Dovetail, and BioNano was used to generate a high quality draft assembly of R108, a M. truncatula accession widely used in studies of functional genomics. As a test for the usefulness of the resulting genome sequence, the new R108 assembly was used to pinpoint breakpoints and characterize flanking sequence of a previously identified translocation between chromosomes 4 and 8, identifying more than 22.7 Mb of novel sequence not present in the earlier A17 reference assembly. Adding Dovetail followed by BioNano data yielded complementary improvements in continuity over the original PacBio assembly.
This strategy proved efficient and cost-effective for developing a quality draft assembly compared to traditional reference assemblies.


July 7, 2019

Draft genome sequence of Grammothele lineata SDL-CO-2015-1, a jute endophyte with a potential for paclitaxel biosynthesis.

Grammothele lineata strain SDL-CO-2015-1, a basidiomycete fungus, was identified as an endophyte from a jute species, Corchorus olitorius var. 2015, and found to produce paclitaxel, a diterpenic polyoxygenated pseudoalkaloid with antitumor activity. Here, we report the draft genome sequence (42.8 Mb with 9,395 genes) of this strain. Copyright © 2017 Das et al.

