April 21, 2020  |  

Tandem-genotypes: robust detection of tandem repeat expansions from long DNA reads.

Tandemly repeated DNA is highly mutable and causes at least 31 diseases, but it is hard to detect pathogenic repeat expansions genome-wide. Here, we report robust detection of human repeat expansions from careful alignments of long but error-prone (PacBio and nanopore) reads to a reference genome. Our method is robust to systematic sequencing errors, inexact repeats with fuzzy boundaries, and low sequencing coverage. By comparing to healthy controls, we prioritize pathogenic expansions within the top 10 out of 700,000 tandem repeats in whole genome sequencing data. This may help to elucidate the many genetic diseases whose causes remain unknown.


April 21, 2020  |  

Sequencing a Juglans regia × J. microcarpa hybrid yields high-quality genome assemblies of parental species.

Members of the genus Juglans are monoecious wind-pollinated trees in the family Juglandaceae with highly heterozygous genomes, which greatly complicates genome sequence assembly. The genomes of interspecific hybrids are usually composed of haploid genomes of parental species. We exploited this attribute of interspecific hybrids to avoid heterozygosity and sequenced an interspecific hybrid Juglans microcarpa × J. regia using a novel combination of single-molecule sequencing and optical genome mapping technologies. The resulting assemblies of both genomes were remarkably complete including chromosome termini and centromere regions. Chromosome termini consisted of arrays of telomeric repeats about 8 kb long and heterochromatic subtelomeric regions about 10 kb long. The centromeres consisted of arrays of a centromere-specific Gypsy retrotransposon and most contained genes, many of them transcribed. Juglans genomes evolved by a whole-genome-duplication dating back to the Cretaceous-Paleogene boundary and consist of two subgenomes, which were fractionated by numerous short gene deletions evenly distributed along the length of the chromosomes. Fractionation was shown to be asymmetric with one subgenome exhibiting greater gene loss than the other. The asymmetry of the process is ongoing and mirrors an asymmetry in gene expression between the subgenomes. Given the importance of J. microcarpa × J. regia hybrids as potential walnut rootstocks, we catalogued disease resistance genes in the parental genomes and studied their chromosomal distribution. We also estimated the molecular clock rates for woody perennials and deployed them in estimating divergence times of Juglans genomes and those of other woody perennials.


April 21, 2020  |  

Comparative Genomic Analyses Reveal Core-Genome-Wide Genes Under Positive Selection and Major Regulatory Hubs in Outlier Strains of Pseudomonas aeruginosa.

Genomic information for outlier strains of Pseudomonas aeruginosa is exiguous when compared with classical strains. We sequenced and constructed the complete genome of an environmental strain CR1 of P. aeruginosa and performed the comparative genomic analysis. It clustered with the outlier group, hence we scaled up the analyses to understand the differences in environmental and clinical outlier strains. We identified eight new regions of genomic plasticity and a plasmid pCR1 with a VirB/D4 complex followed by trimeric auto-transporter that can induce virulence phenotype in the genome of strain CR1. Virulence genotype analysis revealed that strain CR1 lacked hemolytic phospholipase C and D, three genes for LPS biosynthesis and had reduced antibiotic resistance genes when compared with clinical strains. Genes belonging to proteases, bacterial exporters and DNA stabilization were found to be under strong positive selection, thus facilitating pathogenicity and survival of the outliers. The outliers had the complete operon for the production of vibrioferrin, a siderophore present in plant growth promoting bacteria. The competence to acquire multidrug resistance and new virulence factors makes these strains a potential threat. However, we identified major regulatory hubs that can be used as drug targets against both the classical and outlier groups.


April 21, 2020  |  

Genome plasticity favours double chromosomal Tn4401b-blaKPC-2 transposon insertion in the Pseudomonas aeruginosa ST235 clone.

Pseudomonas aeruginosa Sequence Type 235 is a clone that possesses an extraordinary ability to acquire mobile genetic elements and has been associated with the spread of resistance genes, including genes that encode carbapenemases. Here, we aim to characterize the genetic platforms involved in resistance dissemination in blaKPC-2-positive P. aeruginosa ST235 in Colombia. In a prospective surveillance study of infections in adult patients treated in five ICUs in five distant cities in Colombia, 58 isolates of P. aeruginosa were recovered, of which 27 (46.6%) were resistant to carbapenems. The molecular analysis showed that 6 (22.2%) and 4 (14.8%) isolates harboured the blaVIM and blaKPC-2 genes, respectively. The four blaKPC-2-positive isolates showed a similar PFGE pulsotype and belonged to ST235. Complete genome sequencing of a representative ST235 isolate shows a unique chromosomal contig of 7097.241 bp with eight different resistance genes identified and five transposons: a Tn6162-like with ant(2″)-Ia, two Tn402-like with ant(3″)-Ia and blaOXA-2 and two Tn4401b with blaKPC-2. All transposons were inserted into the genomic islands. Interestingly, the two Tn4401b copies harbouring blaKPC-2 were adjacently inserted into a new genomic island (PAGI-17) with traces of a replicative transposition process. This double insertion was probably driven by several structural changes within the chromosomal region containing PAGI-17 in the ST235 background. This is the first report of a double Tn4401b chromosomal insertion in P. aeruginosa, just within a new genomic island (PAGI-17). This finding indicates once again the great genomic plasticity of this microorganism.
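The percentages in this abstract use two different denominators, which is easy to miss on a first read. The sketch below is a minimal check, assuming that 46.6% refers to all 58 isolates and that 22.2% and 14.8% refer to the 27 carbapenem-resistant isolates; the helper name `pct` is illustrative and not from the study.

```python
def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Counts as reported in the abstract.
total_isolates = 58
carbapenem_resistant = 27
bla_vim = 6
bla_kpc2 = 4

# Assumption: resistance rate is relative to all recovered isolates.
resistant_pct = pct(carbapenem_resistant, total_isolates)

# Assumption: gene carriage rates are relative to the resistant isolates only.
vim_pct = pct(bla_vim, carbapenem_resistant)
kpc_pct = pct(bla_kpc2, carbapenem_resistant)

print(resistant_pct, vim_pct, kpc_pct)
```

Under these assumptions the computed values agree with the reported 46.6%, 22.2% and 14.8%.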


April 21, 2020  |  

Comparative genomic and phylogenetic analyses of Populus section Leuce using complete chloroplast genome sequences

Species of Populus section Leuce are distributed throughout most parts of the Northern Hemisphere and have important economic and ecological significance. However, due to frequent hybridization within Leuce, the phylogenetic relationship between species has not been clarified. The chloroplast (cp) genome is characterized by maternal inheritance and relatively conservative mutation rates; thus, it is a powerful tool for building phylogenetic trees. In this study, we used the PacBio SEQUEL software to determine that the cp genome of Populus tomentosa has a length of 156,558 bp including a large single-copy region (84,717 bp), a small single-copy region (16,555 bp), and a pair of inverted repeat regions (27,643 bp each). The cp genome contains 131 unique genes, including 37 transfer RNAs, 8 ribosomal RNAs, and 86 protein-coding genes. We compared the cp genomes of seven species of section Leuce and identified five cp DNA markers with >1% variable sites. Phylogenetic analyses revealed two evolutionary branches for section Leuce. The species with the closest relationship with P. tomentosa was P. adenopoda, followed by P. alba. These cp genome data will help to determine the cp evolution of section Leuce and further elucidate the origin of P. tomentosa.
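The reported region sizes can be checked against the total genome length with a short calculation. This is a minimal sketch, assuming the usual quadripartite chloroplast layout in which the 27,643 bp inverted repeat is present twice; the variable names are illustrative and not from the study.

```python
# Region lengths in bp, as reported in the abstract.
lsc = 84_717  # single-copy region (the larger one)
ssc = 16_555  # small single-copy region
ir = 27_643   # one copy of the inverted repeat

# Assumption: quadripartite structure, so the inverted repeat is counted twice.
total_length = lsc + ssc + 2 * ir

# Gene categories as reported: tRNAs, rRNAs and protein-coding genes.
gene_count = 37 + 8 + 86

print(total_length, gene_count)
```

Both sums match the figures in the abstract (156,558 bp and 131 genes), which supports reading 27,643 bp as the length of each inverted repeat copy rather than of the pair.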


April 21, 2020  |  

Direct pathway cloning of the sodorifen biosynthetic gene cluster and recombinant generation of its product in E. coli.

Serratia plymuthica WS3236 was selected for whole genome sequencing based on preliminary genetic and chemical screening indicating the presence of multiple natural product pathways. This led to the identification of a putative sodorifen biosynthetic gene cluster (BGC). The natural product sodorifen is a volatile organic compound (VOC) with an unusual polymethylated hydrocarbon bicyclic structure (C16H26) produced by selected strains of S. plymuthica. The BGC encoding sodorifen consists of four genes, two of which (sodA, sodB) are homologs of genes encoding enzymes of the non-mevalonate pathway and are thought to enhance the amounts of available farnesyl pyrophosphate (FPP), the precursor of sodorifen. Proceeding from FPP, only two enzymes are necessary to produce sodorifen: an S-adenosyl methionine dependent methyltransferase (SodC) with additional cyclisation activity and a terpene-cyclase (SodD). Previous analysis of S. plymuthica found sodorifen production titers are generally low and vary significantly among different producer strains. This precludes studies on the still elusive biological function of this structurally and biosynthetically fascinating bacterial terpene. Sequencing and mining of the S. plymuthica WS3236 genome revealed the presence of 38 BGCs according to antiSMASH analysis, including a putative sodorifen BGC. Further genome mining for sodorifen and sodorifen-like BGCs throughout bacteria was performed using SodC and SodD as queries and identified a total of 28 sod-like gene clusters. Using direct pathway cloning (DiPaC) we intercepted the 4.6 kb candidate sodorifen BGC from S. plymuthica WS3236 (sodA-D) and transformed it into Escherichia coli BL21. Heterologous expression under the control of the tetracycline inducible PtetO promoter firmly linked this BGC to sodorifen production. By utilizing this newly established expression system, we increased the production yields by approximately 26-fold when compared to the native producer. In addition, sodorifen was easily isolated in high purity by simple head-space sampling. Genome mining of all available genomes within the NCBI and JGI IMG databases led to the identification of a wealth of sod-like pathways which may be responsible for producing a range of structurally unknown sodorifen analogs. Introduction of the S. plymuthica WS3236 sodorifen BGC into the fast-growing heterologous expression host E. coli with a very low VOC background led to a significant increase in both sodorifen product yield and purity compared to the native producer. By providing a reliable, high-level production system, this study sets the stage for future investigations of the biological role and function of sodorifen and for functionally unlocking the bioinformatically identified putative sod-like pathways.


April 21, 2020  |  

Full-length transcript sequencing and comparative transcriptomic analysis to evaluate the contribution of osmotic and ionic stress components towards salinity tolerance in the roots of cultivated alfalfa (Medicago sativa L.).

Alfalfa is the most extensively cultivated forage legume. Salinity is a major environmental factor that affects alfalfa’s productivity. However, little is known about the molecular mechanisms underlying alfalfa responses to salinity, especially the relative contribution of the two important components of osmotic and ionic stress. In this study, we constructed the first full-length transcriptome database for alfalfa root tips under continuous NaCl and mannitol treatments for 1, 3, 6, 12, and 24 h (three biological replicates for each time point, including the control group) via PacBio Iso-Seq. This resulted in the identification of 52,787 full-length transcripts, with an average length of 2551 bp. Global transcriptional changes in the same 33 stressed samples were then analyzed via BGISEQ-500 RNA-Seq. Totals of 8861 NaCl-regulated and 8016 mannitol-regulated differentially expressed genes (DEGs) were identified. Metabolic analyses revealed that these DEGs overlapped or diverged in the cascades of molecular networks involved in signal perception, signal transduction, transcriptional regulation, and antioxidative defense. Notably, several well characterized signalling pathways, such as CDPK, MAPK, CIPK, and PYL-PP2C-SnRK2, were shown to be involved in osmotic stress, while the SOS core pathway was activated by ionic stress. Moreover, the physiological shifts of catalase and peroxidase activity, glutathione and proline content were in accordance with dynamic transcript profiles of the relevant genes, indicating that the antioxidative defense system plays critical roles in response to salinity stress. Overall, our study provides evidence that the response to salinity stress in alfalfa includes both osmotic and ionic components. The key osmotic and ionic stress-related genes are candidates for future studies as potential targets to improve resistance to salinity stress via genetic engineering.


April 21, 2020  |  

Differential retention of transposable element-derived sequences in outcrossing Arabidopsis genomes.

Transposable elements (TEs) are genomic parasites with major impacts on host genome architecture and host adaptation. A proper evaluation of their evolutionary significance has been hampered by the paucity of short scale phylogenetic comparisons between closely related species. Here, we characterized the dynamics of TE accumulation at the micro-evolutionary scale by comparing two closely related plant species, Arabidopsis lyrata and A. halleri. Joint genome annotation in these two outcrossing species confirmed that both contain two distinct populations of TEs with either ‘recent’ or ‘old’ insertion histories. Identification of rare segregating insertions suggests that diverse TE families contribute to the ongoing dynamics of TE accumulation in the two species. Orthologous TE fragments (i.e. those that have been maintained in both species) tend to be located closer to genes than those that are retained in one species only. Compared to non-orthologous TE insertions, those that are orthologous tend to produce fewer short interfering RNAs, are less heavily methylated when found within or adjacent to genes and these tend to have lower expression levels. These findings suggest that long-term retention of TE insertions reflects their frequent acquisition of adaptive roles and/or the deleterious effects of removing nearly neutral TE insertions when they are close to genes. Our results indicate rapid evolutionary dynamics of the TE landscape in these two outcrossing species, with an important input of a diverse set of new insertions with variable propensity to resist deletion.


April 21, 2020  |  

Construction of JRG (Japanese reference genome) with single-molecule real-time sequencing

In recent genome analyses, population-specific reference panels have proven important. However, reference panels based on short-read sequencing data do not sufficiently cover long insertions. Therefore, the nature of long insertions has not been well documented. Here, we assembled a Japanese genome using single-molecule real-time sequencing data and characterized insertions found in the assembled genome. We identified 3691 insertions ranging from 100 bp to ~10,000 bp in the assembled genome relative to the international reference sequence (GRCh38). To validate and characterize these insertions, we mapped short reads from 1070 Japanese individuals and 728 individuals from eight other populations to insertions integrated into GRCh38. With this result, we constructed JRGv1 (Japanese Reference Genome version 1) by integrating the 903 verified insertions, totaling 1,086,173 bases, shared by at least two Japanese individuals into GRCh38. We also constructed decoyJRGv1 by concatenating 3559 verified insertions, totaling 2,536,870 bases, shared by at least two Japanese individuals or by six other assemblies. This assembly improved the alignment ratio by 0.4% on average. These results demonstrate the importance of refining the reference assembly and creating a population-specific reference genome. JRGv1 and decoyJRGv1 are available at the JRG website.


April 21, 2020  |  

Retrotranspositional landscape of Asian rice revealed by 3000 genomes.

The recent release of genomic sequences for 3000 rice varieties provides access to the genetic diversity at species level for this crop. We take advantage of this resource to unravel some features of the retrotranspositional landscape of rice. We develop the software TRACKPOSON specifically for the detection of transposable element insertion polymorphisms (TIPs) from large datasets. We apply this tool to 32 families of retrotransposons and identify more than 50,000 TIPs in the 3000 rice genomes. Most polymorphisms are found at very low frequency, suggesting that they may have occurred recently in agro. A genome-wide association study shows that these activations in rice may be triggered by external stimuli, rather than by the alteration of genetic factors involved in transposable element silencing pathways. Finally, the TIPs dataset is used to trace the origin of rice domestication. Our results suggest that rice originated from three distinct domestication events.


April 21, 2020  |  

Genome sequence and transcriptomic profiles of a marine bacterium, Pseudoalteromonas agarivorans Hao 2018.

Members of the marine genus Pseudoalteromonas have attracted great interest because of their ability to produce a large number of biologically active substances. Here, we report the complete genome sequence of Pseudoalteromonas agarivorans Hao 2018, a strain isolated from an abalone breeding environment, using second-generation Illumina and third-generation PacBio sequencing technologies. Illumina sequencing offers high quality and short reads, while PacBio technology generates long reads. The scaffolds of the two platforms were assembled to yield a complete genome sequence that included two circular chromosomes and one circular plasmid. Transcriptomic data for Pseudoalteromonas were not available. We therefore collected comprehensive RNA-seq data using Illumina sequencing technology from a fermentation culture of P. agarivorans Hao 2018. Researchers studying the evolution, environmental adaptations and biotechnological applications of Pseudoalteromonas may benefit from our genomic and transcriptomic data to analyze the function and expression of genes of interest.


April 21, 2020  |  

Origin and recent expansion of an endogenous gammaretroviral lineage in domestic and wild canids.

Vertebrate genomes contain a record of retroviruses that invaded the germlines of ancestral hosts and are passed to offspring as endogenous retroviruses (ERVs). ERVs can impact host function since they contain the necessary sequences for expression within the host. Dogs are an important system for the study of disease and evolution, yet no substantiated reports of infectious retroviruses in dogs exist. Here, we utilized Illumina whole genome sequence data to assess the origin and evolution of a recently active gammaretroviral lineage in domestic and wild canids. We identified numerous recently integrated loci of a canid-specific ERV-Fc sublineage within Canis, including 58 insertions that were absent from the reference assembly. Insertions were found throughout the dog genome including within and near gene models. By comparison of orthologous occupied sites, we characterized element prevalence across 332 genomes including all nine extant canid species, revealing evolutionary patterns of ERV-Fc segregation among species as well as subpopulations. Sequence analysis revealed common disruptive mutations, suggesting a predominant form of ERV-Fc spread by trans complementation of defective proviruses. ERV-Fc activity included multiple circulating variants that infected canid ancestors from the last 20 million to within 1.6 million years, with recent bursts of germline invasion in the sublineage leading to wolves and dogs.


April 21, 2020  |  

Comprehensive analysis of full genome sequence and Bd-milRNA/target mRNAs to discover the mechanism of hypovirulence in Botryosphaeria dothidea strains on pear infection with BdCV1 and BdPV1

Pear ring rot disease, mainly caused by Botryosphaeria dothidea, is widespread in most pear and apple-growing regions. Mycoviruses are used for biocontrol, especially in fruit tree disease. BdCV1 (Botryosphaeria dothidea chrysovirus 1) and BdPV1 (Botryosphaeria dothidea partitivirus 1) influence the biological characteristics of B. dothidea strains. BdCV1 is a potential candidate for the control of fungal disease. Therefore, it is vital to explore interactions between B. dothidea and mycovirus to clarify the pathogenic mechanisms of B. dothidea and hypovirulence of B. dothidea in pear. A high-quality full-length genome sequence of the B. dothidea LW-Hubei isolate was obtained using Single Molecule Real-Time sequencing. The genome has a high repeat content (9.3%) and shows evidence of DNA methylation. The 46.34 Mb genome contained 14,091 predicted genes, of which 13,135 were annotated. B. dothidea was predicted to express 3833 secreted proteins. In bioinformatics analysis, 351 CAZy members, 552 transporters, 128 kinases, and 1096 proteins associated with plant-host interaction (PHI) were identified. RNA-silencing components including two endoribonuclease Dicer, four argonaute (Ago) and three RNA-dependent RNA polymerase (RdRp) molecules were identified and expressed in response to mycovirus infection. Horizontal transfer of the LW-C and LW-P strains indicated that BdCV1 induced host gene silencing in LW-C to suppress BdPV1 transmission. To investigate the role of RNA-silencing in B. dothidea defense, we constructed four small RNA libraries and sequenced B. dothidea micro-like RNAs (Bd-milRNAs) produced in response to BdCV1 and BdPV1 infection. Among these, 167 conserved and 68 candidate novel Bd-milRNAs were identified, of which 161 conserved and 20 novel Bd-milRNAs were differentially expressed. WEGO analysis revealed involvement of the differentially expressed Bd-milRNA-targeted genes in metabolic process, catalytic activity, cell process and response to stress or stimulus. BdCV1 had a greater effect on the phenotype, virulence, conidiomata, vertical and horizontal transmission ability, and mycelia cellular structure biological characteristics of B. dothidea strains than BdPV1 and virus-free strains. The results obtained in this study indicate that mycovirus regulates biological processes in B. dothidea through the combined interaction of antiviral defense mediated by RNA-silencing and milRNA-mediated regulation of target gene mRNA expression.

