The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (
Efficient crop improvement depends on the application of accurate genetic information contained in diverse germplasm resources. Here we report a reference-grade genome of wild soybean accession W05, with a final assembled genome size of 1013.2?Mb and a contig N50 of 3.3?Mb. The analytical power of the W05 genome is demonstrated by several examples. First, we identify an inversion at the locus determining seed coat color during domestication. Second, a translocation event between chromosomes 11 and 13 of some genotypes is shown to interfere with the assignment of QTLs. Third, we find a region containing copy number variations of the Kunitz…
The soil-borne pathogenic Ralstonia solanacearum species complex (RSSC) is a group of plant pathogens that is economically destructive worldwide and has a broad host range, including various solanaceae plants, banana, ginger, sesame, and clove. Previously, Korean RSSC strains isolated from samples of potato bacterial wilt were grouped into four pathotypes based on virulence tests against potato, tomato, eggplant, and pepper. In this study, we sequenced the genomes of 25 Korean RSSC strains selected based on these pathotypes. The newly sequenced genomes were analyzed to determine the phylogenetic relationships between the strains with average nucleotide identity values, and structurally compared via…
Until recently, the commercial production of Cannabis sativa was restricted to varieties that yielded high-quality fiber while producing low levels of the psychoactive cannabinoid tetrahydrocannabinol (THC). In the last few years, a number of jurisdictions have legalized the production of medical and/or recreational cannabis with higher levels of THC, and other jurisdictions seem poised to follow suit. Consequently, demand for industrial-scale production of high yield cannabis with consistent cannabinoid profiles is expected to increase. In this paper we highlight that currently, projected annual production of cannabis is based largely on facility size, not yield per square meter. This meta-analysis of…
Multispecies host-parasite evolution is common, but how parasites evolve after speciating remains poorly understood. Shared evolutionary history and physiology may propel species along similar evolutionary trajectories whereas pursuing different strategies can reduce competition. We test these scenarios in the economically important association between honey bees and ectoparasitic mites by sequencing the genomes of the sister mite species Varroa destructor and Varroa jacobsoni. These genomes were closely related, with 99.7% sequence identity. Among the 9,628 orthologous genes, 4.8% showed signs of positive selection in at least one species. Divergent selective trajectories were discovered in conserved chemosensory gene families (IGR, SNMP), and…
Echinococcus tapeworms cause a severe helminthic zoonosis called echinococcosis. The genus comprises various species and genotypes, of which E. granulosus (sensu stricto) represents a significant global public health and socioeconomic burden. Mitochondrial (mt) genomes have provided useful genetic markers to explore the nature and extent of genetic diversity within Echinococcus and have underpinned phylogenetic and population structure analyses of this genus. Our recent work indicated a sequence gap (>?1 kb) in the mt genomes of E. granulosus genotype G1, which could not be determined by PCR-based Sanger sequencing. The aim of the present study was to define the complete mt genome,…
Recent adva1nces in whole genome sequencing (WGS) based technologies have facilitated multi-step applications for predicting antimicrobial resistance (AMR) and investigating the molecular epidemiology of Neisseria gonorrhoeae. However, generating full scaffolds of N. gonorrhoeae genomes from short reads, and the assignment of molecular epidemiological information (NG-MLST, NG-MAST, and NG-STAR) to multiple assembled samples, is challenging due to required manual tasks such as annotating antimicrobial resistance determinants with standard nomenclature for a large number of genomes.We present Gen2Epi, a pipeline that assembles short reads into full scaffolds and automatically assigns molecular epidemiological and AMR information to the assembled genomes. Gen2Epi is a…
Pistachio (Pistacia vera), one of the most important commercial nut crops worldwide, is highly adaptable to abiotic stresses and is tolerant to drought and salt stresses.Here, we provide a draft de novo genome of pistachio as well as large-scale genome resequencing. Comparative genomic analyses reveal stress adaptation of pistachio is likely attributable to the expanded cytochrome P450 and chitinase gene families. Particularly, a comparative transcriptomic analysis shows that the jasmonic acid (JA) biosynthetic pathway plays an important role in salt tolerance in pistachio. Moreover, we resequence 93 cultivars and 14 wild P. vera genomes and 35 closely related wild Pistacia…
The recent release of genomic sequences for 3000 rice varieties provides access to the genetic diversity at species level for this crop. We take advantage of this resource to unravel some features of the retrotranspositional landscape of rice. We develop software TRACKPOSON specifically for the detection of transposable elements insertion polymorphisms (TIPs) from large datasets. We apply this tool to 32 families of retrotransposons and identify more than 50,000 TIPs in the 3000 rice genomes. Most polymorphisms are found at very low frequency, suggesting that they may have occurred recently in agro. A genome-wide association study shows that these activations…