April 21, 2020

Discovery of tandem and interspersed segmental duplications using high-throughput sequencing.

Several algorithms have been developed that use high-throughput sequencing technology to characterize structural variations (SVs). Most of the existing approaches focus on detecting relatively simple types of SVs such as insertions, deletions and short inversions. In fact, complex SVs are of crucial importance and several have been associated with genomic disorders. To better understand the contribution of complex SVs to human disease, we need new algorithms to accurately discover and genotype such variants. Additionally, due to similar sequencing signatures, inverted duplications or gene conversion events that include inverted segmental duplications are often characterized as simple inversions; likewise, duplications and gene conversions in direct orientation may be called as simple deletions. Therefore, there is still a need for accurate algorithms to fully characterize complex SVs and thus improve the calling accuracy of simpler variants. We developed novel algorithms to accurately characterize tandem, direct and inverted interspersed segmental duplications using short read whole genome sequencing datasets. We integrated these methods into our TARDIS tool, which is now capable of detecting various types of SVs using multiple sequence signatures such as read pair, read depth and split read. We evaluated the prediction performance of our algorithms through several experiments using both simulated and real datasets. In the simulation experiments, at 30× coverage, TARDIS achieved 96% sensitivity with only a 4% false discovery rate. For experiments that involve real data, we used two haploid genomes (CHM1 and CHM13) and one human genome (NA12878) from the Illumina Platinum Genomes set. Comparison of our results with orthogonal PacBio call sets from the same genomes revealed higher accuracy for TARDIS than state-of-the-art methods.
Furthermore, we showed a surprisingly low false discovery rate of our approach for the discovery of tandem, direct and inverted interspersed segmental duplications on CHM1 (<5% for the top 50 predictions). TARDIS source code is available at https://github.com/BilkentCompGen/tardis, and a corresponding Docker image is available at https://hub.docker.com/r/alkanlab/tardis/. Supplementary data are available at Bioinformatics online. © The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
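
For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how sensitivity and false discovery rate are conventionally computed from a comparison of predicted calls against a truth set. The function names and the counts are hypothetical illustrations chosen to reproduce the 96% and 4% figures; they are not taken from the TARDIS evaluation or its code.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of real variants that were called: TP / (TP + FN)."""
    total_real = true_positives + false_negatives
    if total_real == 0:
        raise ValueError("no real variants in the truth set")
    return true_positives / total_real


def false_discovery_rate(true_positives: int, false_positives: int) -> float:
    """Fraction of calls that are wrong: FP / (TP + FP)."""
    total_calls = true_positives + false_positives
    if total_calls == 0:
        raise ValueError("no calls were made")
    return false_positives / total_calls


if __name__ == "__main__":
    # Hypothetical counts, assumed for illustration only.
    tp, fn, fp = 96, 4, 4
    print(f"sensitivity = {sensitivity(tp, fn):.2%}")
    print(f"FDR         = {false_discovery_rate(tp, fp):.2%}")
```

Note that the two metrics use different denominators: sensitivity is relative to the truth set, while the false discovery rate is relative to the set of calls made.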


April 21, 2020

Mediterraneibacter butyricigenes sp. nov., a butyrate-producing bacterium isolated from human faeces.

A Gram-stain-positive, obligately anaerobic, non-motile, non-spore-forming, and rod-shaped bacterial strain, designated KGMB01110T, was isolated from a faecal sample of a healthy male in South Korea. Phylogenetic analysis based on the 16S rRNA gene showed that strain KGMB01110T belonged to Clostridium cluster XIVa and was most closely related to Mediterraneibacter glycyrrhizinilyticus KCTC 5760T (95.9% 16S rRNA gene sequence similarity). The DNA G + C content of strain KGMB01110T based on its whole genome sequence was 44.1 mol%. The major cellular fatty acids (> 10%) of the isolate were C14:0 and C16:0. Strain KGMB01110T was positive for arginine dihydrolase, β-galactosidase-6-phosphatase, and alkaline phosphatase. Strain KGMB01110T also produced acid from D-glucose and D-rhamnose, and hydrolyzed gelatin and aesculin. Furthermore, HPLC analysis and UV-tests of culture supernatant revealed that strain KGMB01110T produced butyrate as the major end product of glucose fermentation. Based on the phylogenetic and phenotypic characteristics, strain KGMB01110T represents a novel species of the genus Mediterraneibacter in the family Lachnospiraceae. The type strain is KGMB01110T (= KCTC 15684T = CCUG 72830T).


April 21, 2020

Neopinone isomerase is involved in codeine and morphine biosynthesis in opium poppy.

The isomerization of neopinone to codeinone is a critical step in the biosynthesis of opiate alkaloids in opium poppy. Previously assumed to be spontaneous, the process is in fact catalyzed enzymatically by neopinone isomerase (NISO). Without NISO the primary metabolic products in the plant, in engineered microbes and in vitro are neopine and neomorphine, which are structural isomers of codeine and morphine, respectively. Inclusion of NISO in yeast strains engineered to convert thebaine to natural or semisynthetic opiates dramatically enhances formation of the desired products at the expense of neopine and neomorphine accumulation. Along with thebaine synthase, NISO is the second member of the pathogenesis-related 10 (PR10) protein family recently implicated in the enzymatic catalysis of a presumed spontaneous conversion in morphine biosynthesis.


April 21, 2020

Novel trimethoprim resistance gene dfrA34 identified in Salmonella Heidelberg in the USA.

Trimethoprim/sulfamethoxazole is a synthetic antibiotic combination recommended for the treatment of complicated non-typhoidal Salmonella infections in humans. Resistance to trimethoprim/sulfamethoxazole is mediated by the acquisition of mobile genes, requiring both a dfr gene (trimethoprim resistance) and a sul gene (sulfamethoxazole resistance) for a clinical resistance phenotype (MIC ≥4/76 mg/L). In 2017, the CDC investigated a multistate outbreak caused by a Salmonella enterica serotype Heidelberg strain with trimethoprim/sulfamethoxazole resistance, in which sul genes but no known dfr genes were detected. This study aimed to characterize and describe the molecular mechanism of trimethoprim resistance in a Salmonella Heidelberg outbreak isolate. Illumina sequencing data for one outbreak isolate revealed a 588 bp ORF encoding a putative dfr gene. This gene was cloned into Escherichia coli and resistance to trimethoprim was measured by broth dilution and Etest. Phylogenetic analysis of previously reported dfrA genes was performed using MEGA. Long-read sequencing was conducted to determine the context of the novel dfr gene. The novel dfr gene, named dfrA34, conferred trimethoprim resistance (MIC ≥32 mg/L) when cloned into E. coli. Based on predicted amino acid sequences, dfrA34 shares less than 50% identity with other known dfrA genes. The dfrA34 gene is located in a class 1 integron in a multiresistance region of an IncC plasmid, adjacent to a sul gene, thus conferring clinical trimethoprim/sulfamethoxazole resistance. Additionally, dfrA34 is associated with ISCR1, enabling easy transmission between other plasmids and bacterial strains.


April 21, 2020

Intestinibaculum porci gen. nov., sp. nov., a new member of the family Erysipelotrichaceae isolated from the small intestine of a swine.

A strictly anaerobic, Gram-stain-positive, catalase-negative, non-motile, rod-shaped bacterium, designated SG0102T, was isolated from the small intestine of a swine. Optimal growth occurred at 37°C and pH 7.0. Furthermore, growth was observed in the presence of up to 3% (w/v) NaCl but not at salinity levels higher than 4%. The comparative analysis of 16S rRNA gene sequences showed that strain SG0102T was most closely related to Kandleria vitulina DSM 20405T (93.3%), followed by Catenibacterium mitsuokai KCTC 5053T (91.1%), Sharpea azabuensis KCTC 15217T (91.0%), and Eggerthia catenaformis DSM 5348T (89.6%). The average nucleotide identity values between strain SG0102T and the related species K. vitulina DSM 20405T, C. mitsuokai KCTC 5053T, S. azabuensis KCTC 15217T, and E. catenaformis DSM 5348T were 71.0, 69.3, 70.0, and 69.2%, respectively. The phylogenetic analysis based on the 16S rRNA gene sequence revealed that strain SG0102T belonged to the family Erysipelotrichaceae in the class Erysipelotrichia. The DNA G + C content of strain SG0102T was 39.5 mol%. The major cellular fatty acids (> 10%) of strain SG0102T were C16:0, C16:0 dimethyl acetal, and C18:2 ω9/12c. The cell wall peptidoglycan of strain SG0102T contained meso-diaminopimelic acid. Strain SG0102T produced lactic acid as a major end product of fermentation. These distinct phenotypic and phylogenetic properties suggest that strain SG0102T represents a novel species in a novel genus of the family Erysipelotrichaceae, for which the name Intestinibaculum porci gen. nov., sp. nov. is proposed. The type strain is SG0102T (= KCTC 15725T = NBRC 113396T).


April 21, 2020

Phased genome sequence of an interspecific hybrid flowering cherry, ‘Somei-Yoshino’ (Cerasus × yedoensis).

We report the phased genome sequence of an interspecific hybrid, the flowering cherry ‘Somei-Yoshino’ (Cerasus × yedoensis). The sequence data were obtained by single-molecule real-time sequencing technology, split into two subsets based on genome information of the two probable ancestors, and assembled to obtain two haplotype phased genome sequences of the interspecific hybrid. The resultant genome assembly consisting of the two haplotype sequences spanned 690.1 Mb with 4,552 contigs and an N50 length of 1.0 Mb. We predicted 95,076 high-confidence genes, including 94.9% of the core eukaryotic genes. Based on a high-density genetic map, we established a pair of eight pseudomolecule sequences, with highly conserved structures between the two haplotype sequences with 2.4 million sequence variants. A whole genome resequencing analysis of flowering cherries suggested that ‘Somei-Yoshino’ might be derived from a cross between C. spachiana and either C. speciosa or its relatives. A time-course transcriptome analysis of floral buds and flowers suggested comprehensive changes in gene expression in floral bud development towards flowering. These genome and transcriptome data are expected to provide insights into the evolution and cultivation of flowering cherry and the molecular mechanism underlying flowering. © The Author(s) 2019. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
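
The N50 length reported above is a standard assembly contiguity statistic: the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total assembly size. A minimal sketch of that calculation follows; the function name `n50` and the toy contig lengths are assumptions for illustration and are unrelated to the ‘Somei-Yoshino’ assembly data.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths.

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the cumulative sum first reaches half the total length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable: cumulative sum always reaches total")


if __name__ == "__main__":
    # Toy contig lengths, assumed for illustration only.
    toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
    print(n50(toy_contigs))
```

A larger N50 for the same total assembly size means that more of the sequence sits in long contigs.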


April 21, 2020

Transcriptomic response of Escherichia coli O157 isolates on meat: Comparison between a typical Australian isolate from cattle and a pathogenic clinical isolate

The majority of foodborne illnesses associated with E. coli O157 are attributed to the consumption of foods of bovine origin. In this study, RNA-Seq experiments were undertaken with E. coli O157 to identify genes that may be associated with growth and survival on meat and the beef carcass at low temperature. In addition, the response of an E. coli O157 isolate representative of the general genetic ‘type’ found in Australia (E. coli O157:H- strain EC2422) was compared to that of a pathogenic clinical isolate (E. coli O157:H7 strain Sakai) not typically found in Australia. Both strains up-regulated genes involved in the acid stress response, cold shock response, quorum sensing, biofilm formation and Shiga toxin production. Differences were also observed, with E. coli O157:H7 Sakai up-regulating genes that play a critical role in the barrier function of the outer membrane, lipopolysaccharide biosynthesis, extracellular polysaccharide synthesis and curli production. In contrast, E. coli O157:H- EC2422 down-regulated genes involved in peptidoglycan biosynthesis and in the primary envelope stress response Cpx system. The unique gene expression profiles of the strains indicate that these genotypes may differ in their ability to persist in the meat production environment and therefore also in their ability to cause disease.


April 21, 2020

Characterization of a Novel Insecticidal Protein Cry9Cb1 from Bacillus thuringiensis.

In recent decades, there have been increasing reports of insect resistance in Bacillus thuringiensis (Bt) crops. Alternative use of Cry toxins with high insecticidal activity and different mechanisms of action may be an important strategy to manage this resistance. Cry9 protein, with high toxicity to lepidopteran pests and no cross-resistance with commercial Cry1 proteins, is a valuable resource. A novel insecticidal protein, MP1489, subsequently named Cry9Cb1, with 88% amino acid sequence identity with Cry9Ca1, was identified from Bt strain SP663; it exhibited high insecticidal activity against Plutella xylostella, Ostrinia furnacalis, and Chilo suppressalis and no cross-resistance with Cry1Fa in Ostrinia furnacalis. Its minimal active fragments against Plutella xylostella and Ostrinia furnacalis were identified to be 72T-657V and 68D-655A, respectively; food-safety assessment showed no sequence homology with any known allergen, as well as rapid degradation and inactivation by both heat and the gastrointestinal environment. Therefore, Cry9Cb1 is proposed as a promising insecticidal protein for use in agriculture.


April 21, 2020

Genome assembly and gene expression in the American black bear provides new insights into the renal response to hibernation.

The prevalence of chronic kidney disease (CKD) is rising worldwide and 10-15% of the global population currently suffers from CKD and its complications. Given the increasing prevalence of CKD there is an urgent need to find novel treatment options. The American black bear (Ursus americanus) copes with months of lowered kidney function and metabolism during hibernation without the devastating effects on metabolism and other consequences observed in humans. In a biomimetic approach to better understand kidney adaptations and physiology in hibernating black bears, we established a high-quality genome assembly. Subsequent RNA-Seq analysis of kidneys comparing gene expression profiles in black bears entering (late fall) and emerging (early spring) from hibernation identified 169 protein-coding genes that were differentially expressed. Of these, 101 genes were downregulated and 68 genes were upregulated after hibernation. Fold changes ranged from 1.8-fold downregulation (RTN4RL2) to 2.4-fold upregulation (CISH). Most notable was the upregulation of cytokine suppression genes (SOCS2, CISH, and SERPINC1) and the lack of increased expression of cytokines and genes involved in inflammation. The identification of these differences in gene expression in the black bear kidney may provide new insights in the prevention and treatment of CKD. © The Author(s) 2018. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.


April 21, 2020

Nodule bacteria from the cultured legume Phaseolus dumosus (belonging to the Phaseolus vulgaris cross-inoculation group) with common tropici phenotypic characteristics and symbiovar but distinctive phylogenomic position and chromid.

Phaseolus dumosus is an endemic species from mountain tops in Mexico that was found in traditional agriculture areas in Veracruz, Mexico. P. dumosus plants were identified by ITS sequences and their nodules were collected from agricultural fields or from trap plant experiments in the laboratory. Bacteria from P. dumosus nodules were identified as belonging to the phaseoli-etli-leguminosarum (PEL) or to the tropici group by 16S rRNA gene sequences. We obtained complete closed genomes from two P. dumosus isolates, CCGE531 and CCGE532, that were phylogenetically placed within the tropici group but with a distinctive phylogenomic position and low average nucleotide identity (ANI). CCGE531 and CCGE532 had common phenotypic characteristics with tropici type B rhizobial symbionts. Genome synteny analysis and ANI showed that P. dumosus isolates had different chromids, and our analysis suggests that chromids have independently evolved in different lineages of the Rhizobium genus. Finally, we considered that P. dumosus and Phaseolus vulgaris plants belong to the same cross-inoculation group since they have conserved symbiotic affinities for rhizobia. Copyright © 2018 Elsevier GmbH. All rights reserved.


April 21, 2020

Potential for Adaptation to Climate Change Through Genomic Breeding in Sesame

Sesame is an important oilseed crop with high oil content and oil quality. The abundant unsaturated fatty acids, proteins, and antioxidants in sesame seeds drive worldwide consumption of sesame products. Sesame is highly tolerant of drought and poor soil conditions, but it is readily affected by diseases and waterlogging stress, which reduce seed yield and quality. Achieving a high and stable yield in sesame is therefore an urgent requirement. It is also necessary to increase the level of mechanization of the harvest in the world’s sesame production. Sesame, S. indicum, is the sole cultivated species in the genus Sesamum. Its relatively low genetic diversity limits the breeding of new and substantially improved varieties. In this section, we present a review of the key agronomic traits and the breeding methods currently used in the species. We also highlight the achievements of the Sesame Genome Project (SGP) and the potential for genomics-assisted breeding in sesame.


April 21, 2020

RADAR-seq: A RAre DAmage and Repair sequencing method for detecting DNA damage on a genome-wide scale.

RAre DAmage and Repair sequencing (RADAR-seq) is a highly adaptable sequencing method that enables the identification and detection of rare DNA damage events for a wide variety of DNA lesions at single-molecule resolution on a genome-wide scale. In RADAR-seq, DNA lesions are replaced with a patch of modified bases that can be directly detected by Pacific Biosciences Single Molecule Real-Time (SMRT) sequencing. RADAR-seq enables dynamic detection over a wide range of DNA damage frequencies, including low physiological levels. Furthermore, without the need for DNA amplification and enrichment steps, RADAR-seq provides sequencing coverage of damaged and undamaged DNA across an entire genome. Here, we use RADAR-seq to measure the frequency and map the location of ribonucleotides in wild-type and RNaseH2-deficient Escherichia coli (E. coli) and Thermococcus kodakarensis strains. Additionally, by tracking ribonucleotides incorporated during in vivo lagging strand DNA synthesis, we determined the replication initiation point in E. coli and its relation to the origin of replication (oriC). RADAR-seq was also used to map cyclobutane pyrimidine dimers (CPDs) in E. coli genomic DNA exposed to UV radiation. On a broader scale, RADAR-seq can be applied to understand formation and repair of DNA damage, the correlation between DNA damage and disease initiation and progression, and complex biological pathways, including DNA replication. Copyright © 2019 The Authors. Published by Elsevier B.V. All rights reserved.


April 21, 2020

Secretion of an Argonaute protein by a parasitic nematode and the evolution of its siRNA guides.

Extracellular RNA has been proposed to mediate communication between cells and organisms; however, relatively little is understood regarding how specific sequences are selected for export. Here, we describe a specific Argonaute protein (exWAGO) that is secreted in extracellular vesicles (EVs) released by the gastrointestinal nematode Heligmosomoides bakeri, at multiple copies per EV. Phylogenetic and gene expression analyses demonstrate that exWAGO orthologues are highly conserved and abundantly expressed in related parasites but highly diverged in the free-living genus Caenorhabditis. We show that the most abundant small RNAs released from the nematode parasite are not microRNAs as previously thought, but rather secondary small interfering RNAs (siRNAs) that are produced by RNA-dependent RNA polymerases. The siRNAs that are released in EVs have distinct evolutionary properties compared to those resident in free-living or parasitic nematodes. Immunoprecipitation of exWAGO demonstrates that it specifically associates with siRNAs from transposons and newly evolved repetitive elements that are packaged in EVs and released into the host environment. Together, this work demonstrates molecular and evolutionary selectivity in the small RNA sequences that are released in EVs into the host environment and identifies a novel Argonaute protein as the mediator of this. © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.


April 21, 2020

A 12-kb structural variation in progressive myoclonic epilepsy was newly identified by long-read whole-genome sequencing.

We report a family with progressive myoclonic epilepsy who underwent whole-exome sequencing but was negative for pathogenic variants. Similar clinical courses of a devastating neurodegenerative phenotype of two affected siblings were highly suggestive of a genetic etiology, which indicates that the survey of genetic variation by whole-exome sequencing was not comprehensive. To investigate the presence of a variant that remained unrecognized by standard genetic testing, PacBio long-read sequencing was performed. Structural variant (SV) detection using low-coverage (6×) whole-genome sequencing called 17,165 SVs (7,216 deletions and 9,949 insertions). Our SV selection narrowed down potential candidates to only five SVs (two deletions and three insertions) on the genes tagged with autosomal recessive phenotypes. Among them, a 12.4-kb deletion involving the CLN6 gene was the top candidate because its homozygous abnormalities cause neuronal ceroid lipofuscinosis. This deletion included the initiation codon and was found in a GC-rich region containing multiple repetitive elements. These results indicate the presence of a causal variant in a difficult-to-sequence region and suggest that such variants that remain enigmatic after the application of current whole-exome sequencing technology could be uncovered by unbiased application of long-read whole-genome sequencing.


April 21, 2020

Characterization of the genome of a Nocardia strain isolated from soils in the Qinghai-Tibetan Plateau that specifically degrades crude oil and of this biodegradation.

A strain of Nocardia isolated from crude oil-contaminated soils in the Qinghai-Tibetan Plateau degrades nearly all components of crude oil. This strain was identified as Nocardia soli Y48, and its growth conditions were determined. Complete genome sequencing showed that N. soli Y48 has a 7.3 Mb genome and many genes responsible for hydrocarbon degradation, biosurfactant synthesis, emulsification and other hydrocarbon degradation-related metabolisms. Analysis of the clusters of orthologous groups (COGs) and genomic islands (GIs) revealed that Y48 has undergone significant gene transfer events to adapt to changing environmental conditions (crude oil contamination). The structural features of the genome might provide a competitive edge for the survival of N. soli Y48 in oil-polluted environments and reflect the adaptation of coexisting bacteria to distinct nutritional niches. Copyright © 2018. Published by Elsevier Inc.

