In the wake of constant improvements in sequencing technologies, numerous insect genomes have been sequenced. Currently, 1219 insect genome-sequencing projects have been registered with the National Center for Biotechnology Information, including 401 that have genome assemblies and 155 with an official gene set of annotated protein-coding genes. Comparative genomics analysis showed that the expansion or contraction of gene families was associated with well-studied physiological traits such as immune system, metabolic detoxification, parasitism and polyphagy in insects. Here, we summarize the progress of insect genome sequencing, with an emphasis on how this impacts research on pest control. We begin with a brief introduction to the basic concepts of genome assembly, annotation and metrics for evaluating the quality of draft assemblies. We then provide an overview of genome information for numerous insect species, highlighting examples from prominent model organisms, agricultural pests and disease vectors. We also introduce the major insect genome databases. The increasing availability of insect genomic resources is beneficial for developing alternative pest control methods. However, many opportunities remain for developing data-mining tools that make maximal use of the available insect genome resources. Although rapid progress has been achieved, many challenges remain in the field of insect genomics. © 2019 The Royal Entomological Society.
The Genome Sequence of the Halobacterium salinarum Type Strain Is Closely Related to That of Laboratory Strains NRC-1 and R1.
High-coverage long-read sequencing of the Halobacterium salinarum type strain (91-R6) revealed a 2.17-Mb chromosome and two large plasmids (148 and 102 kb). Population heterogeneity and long repeats were observed. Strain 91-R6 and laboratory strain R1 showed 99.63% sequence identity in common chromosomal regions and only 38 strain-specific segments. This information resolves the previously uncertain relationship between type and laboratory strains. Copyright © 2019 Pfeiffer et al.
Plantibacter flavus, Curtobacterium herbarum, Paenibacillus taichungensis, and Rhizobium selenitireducens Endophytes Provide Host-Specific Growth Promotion of Arabidopsis thaliana, Basil, Lettuce, and Bok Choy Plants.
A collection of bacterial endophytes isolated from stem tissues of plants growing in soils highly contaminated with petroleum hydrocarbons was screened for plant growth-promoting capabilities. Twenty-seven endophytic isolates significantly improved the growth of Arabidopsis thaliana plants in comparison to that of uninoculated control plants. The five most beneficial isolates, one strain each of Curtobacterium herbarum, Paenibacillus taichungensis, and Rhizobium selenitireducens and two strains of Plantibacter flavus, were further examined for growth promotion in Arabidopsis, lettuce, basil, and bok choy plants. Host-specific plant growth promotion was observed when plants were inoculated with the five bacterial strains. P. flavus strain M251 increased the total biomass and total root length of Arabidopsis plants by 4.7 and 5.8 times, respectively, over that of control plants and improved lettuce and basil root growth, while P. flavus strain M259 promoted Arabidopsis shoot and root growth, lettuce and basil root growth, and bok choy shoot growth. A genome comparison between P. flavus strains M251 and M259 showed that both genomes contain up to 70 actinobacterial putative plant-associated genes and genes involved in known plant-beneficial pathways, such as those for auxin and cytokinin biosynthesis and 1-aminocyclopropane-1-carboxylate deaminase production. This study provides evidence of direct plant growth promotion by Plantibacter flavus. IMPORTANCE The discovery of new plant growth-promoting bacteria is necessary for the continued development of biofertilizers, which are environmentally friendly and cost-efficient alternatives to conventional chemical fertilizers. Biofertilizer effects on plant growth can be inconsistent due to the complexity of plant-microbe interactions, as the same bacteria can be beneficial to the growth of some plant species and neutral or detrimental to others.
We examined a set of bacterial endophytes isolated from plants growing in a unique petroleum-contaminated environment to discover plant growth-promoting bacteria. We show that strains of Plantibacter flavus exhibit strain-specific plant growth-promoting effects on four different plant species. Copyright © 2019 American Society for Microbiology.
Advantage of the F2:A1:B- IncF Pandemic Plasmid over IncC Plasmids in In Vitro Acquisition and Evolution of blaCTX-M Gene-Bearing Plasmids in Escherichia coli.
Despite a fitness cost imposed on bacterial hosts, large conjugative plasmids play a key role in the diffusion of resistance determinants, such as CTX-M extended-spectrum β-lactamases. Among the large conjugative plasmids, IncF plasmids are the most predominant group, and an F2:A1:B- IncF-type plasmid encoding a CTX-M-15 variant was recently described as being strongly associated with the emerging worldwide Escherichia coli sequence type 131 (ST131)-O25b:H4 H30Rx/C2 sublineage. In this context, we investigated the fitness cost of narrow-range F-type plasmids, including the F2:A1:B- IncF-type CTX-M-15 plasmid, and of broad-range C-type plasmids in the K-12-like J53-2 E. coli strain. Although all plasmids imposed a significant fitness cost to the bacterial host immediately after conjugation, we show, using an experimental-evolution approach, that a negative impact on the fitness of the host strain was maintained throughout 1,120 generations with the IncC-IncR plasmid, regardless of the presence or absence of cefotaxime, in contrast to the F2:A1:B- IncF plasmid, whose cost was alleviated. Many chromosomal and plasmid rearrangements were detected after conjugation in transconjugants carrying the IncC plasmids but not in transconjugants carrying the F2:A1:B- IncF plasmid, except for insertion sequence (IS) mobilization from the fliM gene leading to the restoration of motility of the recipient strains. Only a few mutations occurred on the chromosome of each transconjugant throughout the experimental-evolution assay. Our findings indicate that the F2:A1:B- IncF CTX-M-15 plasmid is well adapted to the E. coli strain studied, contrary to the IncC-IncR CTX-M-15 plasmid, and that such plasmid-host adaptation could participate in the evolutionary success of the CTX-M-15-producing pandemic E. coli ST131-O25b:H4 lineage. Copyright © 2019 Mahérault et al.
Genomic and transcriptomic characterization of Pseudomonas aeruginosa small colony variants derived from a chronic infection model.
Phenotypic change is a hallmark of bacterial adaptation during chronic infection. In the case of chronic Pseudomonas aeruginosa lung infection in patients with cystic fibrosis, well-characterized phenotypic variants include mucoid and small colony variants (SCVs). It has previously been shown that SCVs can be reproducibly isolated from the murine lung following the establishment of chronic infection with mucoid P. aeruginosa strain NH57388A. Using a combination of single-molecule real-time (PacBio) and Illumina sequencing we identify a large genomic inversion in the SCV through recombination between homologous regions of two rRNA operons and an associated truncation of one of the 16S rRNA genes and suggest this may be the genetic switch for conversion to the SCV phenotype. This phenotypic conversion is associated with large-scale transcriptional changes distributed throughout the genome. This global rewiring of the cellular transcriptomic output results in changes to normally differentially regulated genes that modulate resistance to oxidative stress, central metabolism and virulence. These changes are of clinical relevance because the appearance of SCVs during chronic infection is associated with declining lung function.
Complete Genome Sequence of the Telford Type S Strain of Mycobacterium avium subsp. paratuberculosis
Mycobacterium avium subsp. paratuberculosis is the causative agent of Johne's disease (JD). Here, we report the complete genome sequence of Telford 9.2, a well-characterized representative strain of the M. avium subsp. paratuberculosis S subtype that is endemic in New Zealand and Australian sheep.
Construction of chromosome-level assembly is a vital step in achieving the goal of a ‘Platinum’ genome, but it remains a major challenge to assemble and anchor sequences to chromosomes in autopolyploid or highly heterozygous genomes. High-throughput chromosome conformation capture (Hi-C) technology serves as a robust tool to dramatically advance chromosome scaffolding; however, existing approaches are mostly designed for diploid genomes and often with the aim of reconstructing a haploid representation, thereby having limited power to reconstruct chromosomes for autopolyploid genomes. We developed a novel algorithm (ALLHiC) that is capable of building allele-aware, chromosomal-scale assembly for autopolyploid genomes using Hi-C paired-end reads with innovative ‘prune’ and ‘optimize’ steps. Application on simulated data showed that ALLHiC can phase allelic contigs and substantially improve ordering and orientation when compared to other mainstream Hi-C assemblers. We applied ALLHiC on an autotetraploid and an autooctoploid sugar-cane genome and successfully constructed the phased chromosomal-level assemblies, revealing allelic variations present in these two genomes. The ALLHiC pipeline enables de novo chromosome-level assembly of autopolyploid genomes, separating each allele. Haplotype chromosome-level assembly of allopolyploid and heterozygous diploid genomes can be achieved using ALLHiC, overcoming obstacles in assembling complex genomes.
Genome of Crucihimalaya himalaica, a close relative of Arabidopsis, shows ecological adaptation to high altitude.
Crucihimalaya himalaica, a close relative of Arabidopsis and Capsella, grows on the Qinghai-Tibet Plateau (QTP) about 4,000 m above sea level and represents an attractive model system for studying speciation and ecological adaptation in extreme environments. We assembled a draft genome sequence of 234.72 Mb encoding 27,019 genes and investigated its origin and adaptive evolutionary mechanisms. Phylogenomic analyses based on 4,586 single-copy genes revealed that C. himalaica is most closely related to Capsella (estimated divergence 8.8 to 12.2 Mya), whereas both species form a sister clade to Arabidopsis thaliana and Arabidopsis lyrata, from which they diverged between 12.7 and 17.2 Mya. LTR retrotransposons in C. himalaica proliferated shortly after the dramatic uplift and climatic change of the Himalayas from the Late Pliocene to Pleistocene. Compared with closely related species, C. himalaica showed significant contraction and pseudogenization in gene families associated with disease resistance and also significant expansion in gene families associated with ubiquitin-mediated proteolysis and DNA repair. We identified hundreds of genes involved in DNA repair, ubiquitin-mediated proteolysis, and reproductive processes with signs of positive selection. Gene families showing dramatic changes in size and genes showing signs of positive selection are likely candidates for C. himalaica’s adaptation to intense radiation, low temperature, and pathogen-depauperate environments in the QTP. Loss of function at the S-locus, the reason for the transition to self-fertilization of C. himalaica, might have enabled its QTP occupation. Overall, the genome sequence of C. himalaica provides insights into the mechanisms of plant adaptation to extreme environments. Copyright © 2019 the Author(s). Published by PNAS.
The smut fungus Ustilago esculenta has a bipolar mating system with three idiomorphs larger than 500 kb.
Zizania latifolia Turcz., which is mainly distributed in Asia, has had a long cultivation history as a cereal and vegetable crop. On infection with the smut fungus Ustilago esculenta, Z. latifolia becomes an edible vegetable, water bamboo. Two main cultivars, with a green shell and red shell, are cultivated for commercial production in Taiwan. Previous studies indicated that cultivars of Z. latifolia may be related to the infected U. esculenta isolates. However, related research is limited. The infection process of the corn smut fungus Ustilago maydis is coupled with sexual development and under control of the mating type locus. Thus, we aimed to use the knowledge of U. maydis to reveal the mating system of U. esculenta. We collected water bamboo samples and isolated 145 U. esculenta strains from Taiwan’s major production areas. By using PCR and idiomorph screening among meiotic offspring and field isolates, we identified three idiomorphs of the mating type locus and found no sequence recombination between them. Whole-genome sequencing (Illumina and PacBio) suggested that the mating system of U. esculenta was bipolar. Mating type locus 1 (MAT-1) was 552,895 bp and contained 44% repeated sequences. Sequence comparison revealed that U. esculenta MAT-1 shared high gene synteny with Sporisorium reilianum and many repeats with Ustilago hordei MAT-1. These results can be utilized to further explore the genomic diversity of U. esculenta isolates and their application for water bamboo breeding. Copyright © 2019 Elsevier Inc. All rights reserved.
Characterization of the genome of a Nocardia strain isolated from soils in the Qinghai-Tibetan Plateau that specifically degrades crude oil and of this biodegradation.
A strain of Nocardia isolated from crude oil-contaminated soils in the Qinghai-Tibetan Plateau degrades nearly all components of crude oil. This strain was identified as Nocardia soli Y48, and its growth conditions were determined. Complete genome sequencing showed that N. soli Y48 has a 7.3 Mb genome and many genes responsible for hydrocarbon degradation, biosurfactant synthesis, emulsification and other hydrocarbon degradation-related metabolisms. Analysis of the clusters of orthologous groups (COGs) and genomic islands (GIs) revealed that Y48 has undergone significant gene transfer events to adapt to changing environmental conditions (crude oil contamination). The structural features of the genome might provide a competitive edge for the survival of N. soli Y48 in oil-polluted environments and reflect the adaptation of coexisting bacteria to distinct nutritional niches. Copyright © 2018. Published by Elsevier Inc.
Since the dawn of the bioinformatics field, sequence alignment scores have been the main method for comparing sequences. However, alignment algorithms are quadratic, requiring long execution time. As alternatives, scientists have developed tens of alignment-free statistics for measuring the similarity between two sequences. We surveyed tens of alignment-free k-mer statistics. Additionally, we evaluated 33 statistics and multiplicative combinations between the statistics and/or their squares. These statistics are calculated on two k-mer histograms representing two sequences. Our evaluations using global alignment scores revealed that the majority of the statistics are sensitive and capable of finding similar sequences to a query sequence. Therefore, any of these statistics can filter out dissimilar sequences quickly. Further, we observed that multiplicative combinations of the statistics are highly correlated with the identity score. Furthermore, combinations involving sequence length difference or Earth Mover’s distance, which takes the length difference into account, are always among the highest correlated paired statistics with identity scores. Similarly, paired statistics including length difference or Earth Mover’s distance are among the best performers in finding the K-closest sequences. Interestingly, similar performance can be obtained using histograms of shorter words, resulting in reducing the memory requirement and increasing the speed remarkably. Moreover, we found that simple single statistics are sufficient for processing next-generation sequencing reads and for applications relying on local alignment. Finally, we measured the time requirement of each statistic. The survey and the evaluations will help scientists with identifying efficient alternatives to the costly alignment algorithm, saving thousands of computational hours. The source code of the benchmarking tool is available as Supplementary Materials. © The Author 2017. Published by Oxford University Press.
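The idea of comparing two sequences through their k-mer histograms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the statistics shown (Manhattan distance, cosine similarity, length difference, and one multiplicative pairing) are common textbook choices, not necessarily any of the 33 statistics evaluated in the study.

```python
from collections import Counter
from math import sqrt


def kmer_histogram(seq, k):
    """Count every overlapping k-mer in seq (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def manhattan(h1, h2):
    """Sum of absolute count differences over all k-mers seen in either histogram."""
    # Counter returns 0 for a missing key, so absent k-mers need no special case.
    return sum(abs(h1[key] - h2[key]) for key in set(h1) | set(h2))


def cosine_similarity(h1, h2):
    """Cosine of the angle between the two count vectors (0.0 if either is empty)."""
    dot = sum(h1[key] * h2[key] for key in set(h1) & set(h2))
    norm1 = sqrt(sum(v * v for v in h1.values()))
    norm2 = sqrt(sum(v * v for v in h2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)


def length_difference(seq1, seq2):
    """Absolute difference in sequence length."""
    return abs(len(seq1) - len(seq2))


def paired_statistic(seq1, seq2, k):
    """Illustrative multiplicative pairing: Manhattan distance times (1 + length difference).

    The "+ 1" is an assumption made here so that equal-length sequences
    do not collapse the product to zero.
    """
    h1, h2 = kmer_histogram(seq1, k), kmer_histogram(seq2, k)
    return manhattan(h1, h2) * (1 + length_difference(seq1, seq2))
```

Each statistic runs in time linear in the sequence lengths for a fixed k, which is the property that makes such measures attractive as a fast pre-filter before a quadratic alignment.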
Chromosome-level genome assembly of Triplophysa tibetana, a fish adapted to the harsh high-altitude environment of the Tibetan Plateau.
Triplophysa is an endemic fish genus of the Tibetan Plateau in China. Triplophysa tibetana, which lives at a recorded altitude of ~4,000 m and plays an important role in the highland aquatic ecosystem, serves as an excellent model for investigating high-altitude environmental adaptation. However, evolutionary and conservation studies of T. tibetana have been limited by scarce genomic resources for the genus Triplophysa. In the present study, we applied PacBio sequencing and the Hi-C technique to assemble the T. tibetana genome. A 652-Mb genome comprising 1,325 contigs with an N50 length of 3.1 Mb was obtained. A total of 1,137 contigs were further assembled into 25 chromosomes, representing 98.7% and 80.47% of all contigs at the base and sequence number level, respectively. Approximately 260 Mb of sequence, accounting for ~39.8% of the genome, was identified as repetitive elements. DNA transposons (16.3%), long interspersed nuclear elements (12.4%) and long terminal repeats (11.0%) were the most abundant repeat types. In total, 24,372 protein-coding genes were predicted in the genome, and ~95% of the genes were functionally annotated via a search in public databases. Using whole genome sequence information, we found that T. tibetana diverged from its common ancestor with Danio rerio ~121.4 million years ago. The high-quality genome assembled in this work not only provides a valuable genomic resource for future population and conservation studies of T. tibetana, but it also lays a solid foundation for further investigation into the mechanisms of environmental adaptation of endemic fishes in the Tibetan Plateau. © 2019 John Wiley & Sons Ltd.
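For readers unfamiliar with the contig N50 metric quoted above, the following minimal Python sketch shows the standard definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the assembly size. The function name and the toy contig lengths are illustrative assumptions, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the cumulative sum first reaches at least half of
    the total assembly length.
    """
    if not lengths:
        raise ValueError("n50() requires at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if running * 2 >= total:
            return length


# Toy example: total = 20, and 6 + 5 = 11 is the first running sum >= 10.
toy_contigs = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_contigs)
```

A higher N50 indicates that more of the assembly sits in long contigs, which is why it is commonly reported alongside the total assembly size and the contig count.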
SMRT sequencing reveals differential patterns of methylation in two O111:H- STEC isolates from a hemolytic uremic syndrome outbreak in Australia.
In 1995, a severe haemolytic-uremic syndrome (HUS) outbreak occurred in Adelaide. A recent genomic analysis of Shiga toxigenic Escherichia coli (STEC) O111:H- strains 95JB1 and 95NR1 from this outbreak found that the more virulent isolate, 95NR1, harboured two additional copies of the Shiga toxin 2 (Stx2) genes encoded within prophage regions. The structure of the Stx2-converting prophages could not be fully resolved using short-read sequence data alone, and it was not clear if there were other genomic differences between 95JB1 and 95NR1. In this study, we used Pacific Biosciences (PacBio) single molecule real-time (SMRT) sequencing to characterise the genome and methylome of 95JB1 and 95NR1. We completely resolved the structure of all prophages, including two tandemly inserted Stx2-converting prophages in 95NR1 that were absent from 95JB1. Furthermore, we defined all insertion sequences and found an additional IS1203 element in the chromosome of 95JB1. Our analysis of the methylome of 95NR1 and 95JB1 identified hemi-methylation of a novel motif (5′-CTGCm6AG-3′) in more than 4000 sites in the 95NR1 genome. These sites were entirely unmethylated in the 95JB1 genome, and included at least 177 potential promoter regions that could contribute to regulatory differences between the strains. IS1203-mediated deactivation of a novel type IIG methyltransferase in 95JB1 is the likely cause of the observed differential patterns of methylation between 95NR1 and 95JB1. This study demonstrates the capability of PacBio SMRT sequencing to resolve complex prophage regions and reveal the genetic and epigenetic heterogeneity within a clonal population of bacteria.
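As a small illustration of how candidate sites for a motif such as 5′-CTGCAG-3′ can be located in a genome sequence, the Python sketch below scans a string for every occurrence and reports the position of the adenine that carries the m6A mark in the motif. It does not reproduce the SMRT kinetic analysis used to detect methylation; the function names and the toy sequence are assumptions made for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif_sites(seq, motif="CTGCAG"):
    """Return 0-based start positions of every (possibly overlapping) motif hit."""
    seq = seq.upper()
    sites = []
    start = seq.find(motif)
    while start != -1:
        sites.append(start)
        start = seq.find(motif, start + 1)
    return sites


def methylated_base_positions(seq, motif="CTGCAG", offset=4):
    """Return 0-based positions of the base at `offset` within each motif hit.

    For CTGCAG, offset 4 is the adenine written as m6A in 5'-CTGCm6AG-3'.
    """
    return [start + offset for start in find_motif_sites(seq, motif)]


# CTGCAG is its own reverse complement, so forward-strand hits also mark
# the reverse-strand occurrences of the motif.
toy_sequence = "AACTGCAGTTCTGCAG"
toy_sites = find_motif_sites(toy_sequence)
```

Because the motif is palindromic, each site has a candidate adenine on both strands, which is the setting in which hemi-methylation (modification on one strand only) can be distinguished from full methylation.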
Complete Genome Sequence of Sequevar 14M Ralstonia solanacearum Strain HA4-1 Reveals Novel Type III Effectors Acquired Through Horizontal Gene Transfer.
Ralstonia solanacearum, which causes bacterial wilt in a broad range of plants, is considered a “species complex” due to its significant genetic diversity. Recently, we isolated a new R. solanacearum strain, HA4-1, from Hong’an county in Hubei province of China and identified it as phylotype I, sequevar 14M (phylotype I-14M). Interestingly, we found that it can cause various disease symptoms among different potato genotypes and displays different pathogenic behavior compared to a phylogenetically related strain, GMI1000. To dissect the pathogenic mechanisms of HA4-1, we sequenced its whole genome by combined sequencing technologies including Illumina HiSeq2000, PacBio RS II, and BAC-end sequencing. Genome assembly results revealed the presence of a conventional chromosome, a megaplasmid, and a 143 kb plasmid in HA4-1. Comparative genome analysis between HA4-1 and GMI1000 shows high conservation of the general virulence factors such as secretion systems, motility, exopolysaccharides (EPS), and key regulatory factors, but significant variation in the repertoire and structure of type III effectors, which could be the determinants of their differential pathogenesis in certain potato species or genotypes. We have identified two novel type III effectors that were probably acquired through horizontal gene transfer (HGT). These novel R. solanacearum effectors display homology to several YopJ and XopAC family members. We named them RipBR and RipBS. Notably, the copy of RipBR on the plasmid is a pseudogene, while the other on the megaplasmid is normal. For RipBS, there are three copies located in the megaplasmid and plasmid, respectively. Our results have not only enriched the genome information on the R. solanacearum species complex by sequencing the first sequevar 14M strain and the largest plasmid reported in R. solanacearum to date but also revealed the variation in the repertoire of type III effectors. This will greatly contribute to future studies on the pathogenic evolution, host adaptation, and interaction between R. solanacearum and potato.
Methicillin-Resistant Staphylococcus aureus Blood Isolates Harboring a Novel Pseudo-staphylococcal Cassette Chromosome mec Element.
The aim of this work was to assess a novel pseudo-staphylococcal cassette chromosome mec (ψSCCmec) element in methicillin-resistant Staphylococcus aureus (MRSA) blood isolates. Community-associated MRSA E16SA093 and healthcare-associated MRSA F17SA003 isolates were recovered from the blood specimens of patients with S. aureus bacteremia in 2016 and in 2017, respectively. Antimicrobial susceptibility was determined via the disk diffusion method, and SCCmec typing was conducted by multiplex polymerase chain reaction. Whole genome sequencing was carried out by single molecule real-time long-read sequencing. Both isolates belonged to sequence type 72 and agr-type I, and they were negative for Panton-Valentine leukocidin and toxic shock syndrome toxin. The spa-types of E16SA093 and F17SA003 were t324 and t2460, respectively. They had a SCCmec IV-like element devoid of the cassette chromosome recombinase (ccr) gene complex, designated as ψSCCmecE16SA093. The element was derived from SCCmec type IV through deletion of the ccr gene complex and of a 7.0- and 31.9-kb portion of each chromosome, respectively. The deficiency of the ccr gene complex in the SCCmec unit likely results in loss of mobility, which would be an adaptive evolutionary mechanism. The dissemination of this clone should be monitored closely.