In the wake of constant improvements in sequencing technologies, numerous insect genomes have been sequenced. Currently, 1219 insect genome-sequencing projects have been registered with the National Center for Biotechnology Information, including 401 that have genome assemblies and 155 with an official gene set of annotated protein-coding genes. Comparative genomics analysis showed that the expansion or contraction of gene families was associated with well-studied physiological traits such as immune system, metabolic detoxification, parasitism and polyphagy in insects. Here, we summarize the progress of insect genome sequencing, with an emphasis on how this impacts research on pest control. We begin with a brief introduction to the basic concepts of genome assembly, annotation and metrics for evaluating the quality of draft assemblies. We then provide an overview of genome information for numerous insect species, highlighting examples from prominent model organisms, agricultural pests and disease vectors. We also introduce the major insect genome databases. The increasing availability of insect genomic resources is beneficial for developing alternative pest control methods. However, many opportunities remain for developing data-mining tools that make maximal use of the available insect genome resources. Although rapid progress has been achieved, many challenges remain in the field of insect genomics. © 2019 The Royal Entomological Society.
Brassica napus (AACC, 2n = 38) is an important oilseed crop grown worldwide. However, little is known about the population evolution of this species, the genomic difference between its major genetic groups, such as European and Asian rapeseed, and the impacts of historical large-scale introgression events on this young tetraploid. In this study, we reported the de novo assembly of the genome sequences of an Asian rapeseed (B. napus), Ningyou 7, and its four progenitors and compared these genomes with other available genomic data from diverse European and Asian cultivars. Our results showed that Asian rapeseed originally derived from European rapeseed but subsequently significantly diverged, with rapid genome differentiation after hybridization and intensive local selective breeding. The first historical introgression of B. rapa dramatically broadened the allelic pool but decreased the deleterious variations of Asian rapeseed. The second historical introgression of the double-low traits of European rapeseed (canola) has reshaped Asian rapeseed into two groups (double-low and double-high), accompanied by an increase in genetic load in the double-low group. This study demonstrates distinctive genomic footprints and deleterious SNP (single nucleotide polymorphism) variants for local adaptation by recent intra- and interspecies introgression events and provides novel insights for understanding the rapid genome evolution of a young allopolyploid crop. © 2019 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
Expedited assessment of terrestrial arthropod diversity by coupling Malaise traps with DNA barcoding.
Monitoring changes in terrestrial arthropod communities over space and time requires a dramatic increase in the speed and accuracy of processing samples that cannot be achieved with morphological approaches. The combination of DNA barcoding and Malaise traps allows expedited, comprehensive inventories of species abundance whose cost will rapidly decline as high-throughput sequencing technologies advance. Aside from detailing protocols from specimen sorting to data release, this paper describes their use in a survey of arthropod diversity in a national park that examined 21,194 specimens representing 2255 species. These protocols can support arthropod monitoring programs at regional, national, and continental scales.
Nodule bacteria from the cultured legume Phaseolus dumosus (belonging to the Phaseolus vulgaris cross-inoculation group) with common tropici phenotypic characteristics and symbiovar but distinctive phylogenomic position and chromid.
Phaseolus dumosus is an endemic species from mountain tops in Mexico that was found in traditional agriculture areas in Veracruz, Mexico. P. dumosus plants were identified by ITS sequences and their nodules were collected from agricultural fields or from trap plant experiments in the laboratory. Bacteria from P. dumosus nodules were identified as belonging to the phaseoli-etli-leguminosarum (PEL) or to the tropici group by 16S rRNA gene sequences. We obtained complete closed genomes from two P. dumosus isolates, CCGE531 and CCGE532, that were phylogenetically placed within the tropici group but with a distinctive phylogenomic position and low average nucleotide identity (ANI). CCGE531 and CCGE532 had common phenotypic characteristics with tropici type B rhizobial symbionts. Genome synteny analysis and ANI showed that P. dumosus isolates had different chromids and our analysis suggests that chromids have independently evolved in different lineages of the Rhizobium genus. Finally, we considered that P. dumosus and Phaseolus vulgaris plants belong to the same cross-inoculation group since they have conserved symbiotic affinities for rhizobia. Copyright © 2018 Elsevier GmbH. All rights reserved.
Insight into the microbial world of Bemisia tabaci cryptic species complex and its relationships with its host.
The 37 currently recognized Bemisia tabaci cryptic species are economically important and contain both primary and secondary endosymbionts, but their diversity has never been mapped systematically across the group. To achieve this, PacBio sequencing of full-length bacterial 16S rRNA gene amplicons was carried out on 21 globally collected species in the B. tabaci complex, and two samples from B. afer were used as outgroups. The microbial diversity was first explored across the major lineages of the whole group and 15 new putative bacterial sequences were observed. Extensive comparison of our results with previous endosymbiont diversity surveys that used PCR or multiplex 454 pyrosequencing platforms showed that the bacterial diversity had been underestimated. To validate these new putative bacteria, one of them (Halomonas) was first confirmed to be present in MED B. tabaci using HiSeq 2500 and FISH technologies. These results confirmed that PacBio sequencing is a reliable and informative approach for revealing the bacterial diversity of insects. In addition, many new secondary endosymbiotic strains of Rickettsia and Arsenophonus were found, increasing the known diversity in these groups. For the previously described primary endosymbionts, one Portiera operational taxonomic unit (OTU) was shared by all B. tabaci species. The congruence of the B. tabaci-host and Portiera phylogenetic trees provides strong support for the hypothesis that primary endosymbionts co-speciated with their hosts. Likewise, a comparison of bacterial alpha diversities, Principal Coordinate Analysis, the indistinct endosymbiotic communities harbored by different species and the co-divergence analyses suggest a lack of association between overall microbial diversity and cryptic species, further indicating that secondary endosymbiont-mediated speciation is unlikely to have occurred in the B. tabaci species group.
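The comparison of bacterial alpha diversities mentioned above can be illustrated with a minimal sketch. The abstract does not say which alpha diversity index was used, so the Shannon index here is an assumption chosen for illustration, and the function name and OTU read counts are hypothetical rather than taken from the study.

```python
import math


def shannon_index(otu_counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over OTUs with non-zero counts.

    Assumption: the Shannon index stands in for "alpha diversity" here;
    the source abstract does not name the index it used.
    """
    total = sum(otu_counts)
    if total <= 0:
        raise ValueError("at least one OTU must have a positive count")
    proportions = (count / total for count in otu_counts if count > 0)
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical read counts per OTU for two samples (not data from the study).
even_sample = [10, 10, 10, 10]
skewed_sample = [37, 1, 1, 1]

print(f"even sample:   {shannon_index(even_sample):.3f}")
print(f"skewed sample: {shannon_index(skewed_sample):.3f}")
```

A sample dominated by a single OTU scores lower than one with the same number of OTUs at equal abundance, which is the kind of within-sample contrast an alpha diversity comparison looks at.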
The ability to generate long sequencing reads and access long-range linkage information is revolutionizing the quality and completeness of genome assemblies. Here we use a hybrid approach that combines data from four genome sequencing and mapping technologies to generate a new genome assembly of the honeybee Apis mellifera. We first generated contigs based on PacBio sequencing libraries, which were then merged with linked-read 10x Chromium data followed by scaffolding using a BioNano optical genome map and a Hi-C chromatin interaction map, complemented by a genetic linkage map. Each of the assembly steps reduced the number of gaps and incorporated a substantial amount of additional sequence into scaffolds. The new assembly (Amel_HAv3) is significantly more contiguous and complete than the previous one (Amel_4.5), which was based mainly on Sanger sequencing reads. The contig N50 is 120-fold higher (5.381 Mbp compared to 0.053 Mbp) and we anchor >98% of the sequence to chromosomes. All of the 16 chromosomes are represented as single scaffolds with an average of three sequence gaps per chromosome. The improvements are largely due to the inclusion of repetitive sequence that was unplaced in previous assemblies. In particular, our assembly is highly contiguous across centromeres and telomeres and includes hundreds of AvaI and AluI repeats associated with these features. The improved assembly will be of utility for refining gene models, studying genome function, mapping functional genetic variation, identifying structural variants, and comparative genomics.
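The contig N50 quoted above is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length. The following is a minimal sketch of that standard definition. The function name and the toy contig lengths are hypothetical and are not drawn from the Amel_HAv3 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths (arbitrary units), total = 300.
contig_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(contig_lengths))  # → 70
```

In the toy example the two longest contigs (80 + 70) reach half of the total length of 300, so the N50 is 70.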
Shotgun metagenome data sets of microbial communities are highly diverse, not only due to the natural variation of the underlying biological systems, but also due to differences in laboratory protocols, replicate numbers, and sequencing technologies. Accordingly, to effectively assess the performance of metagenomic analysis software, a wide range of benchmark data sets are required. We describe the CAMISIM microbial community and metagenome simulator. The software can model different microbial abundance profiles, multi-sample time series, and differential abundance studies, includes real and simulated strain-level diversity, and generates second- and third-generation sequencing data from taxonomic profiles or de novo. Gold standards are created for sequence assembly, genome binning, taxonomic binning, and taxonomic profiling. CAMISIM generated the benchmark data sets of the first CAMI challenge. For two simulated multi-sample data sets of the human and mouse gut microbiomes, we observed high functional congruence to the real data. As further applications, we investigated the effect of varying evolutionary genome divergence, sequencing depth, and read error profiles on two popular metagenome assemblers, MEGAHIT and metaSPAdes, on several thousand small data sets generated with CAMISIM. CAMISIM can simulate a wide variety of microbial communities and metagenome data sets together with standards of truth for method evaluation. All data sets and the software are freely available at https://github.com/CAMI-challenge/CAMISIM.