Long read sequencing Archives - Page 54 of 57

April 21, 2020

Genomic Analyses Reveal Evidence of Independent Evolution, Demographic History, and Extreme Environment Adaptation of Tibetan Plateau Agaricus bisporus.

Agaricus bisporus distributed in the Tibetan Plateau of China has high-stress resistance that is valuable for breeding improvements. However, its evolutionary history, specialization, and adaptation to the extreme Tibetan Plateau environment are largely unknown. Here, we performed de novo genome sequencing of a representative Tibetan Plateau wild strain ABM and comparative genomic analysis with the reported European strain H97 and H39. The assembled ABM genome was 30.4 Mb in size, and comprised 8,562 protein-coding genes. The ABM genome shared highly conserved syntenic blocks and a few inversions with H97 and H39. The phylogenetic tree constructed by 1,276 single-copy orthologous genes in nine fungal species showed that the Tibetan Plateau and European A. bisporus diverged ~5.5 million years ago. Population genomic analysis using genome resequencing of 29 strains revealed that the Tibetan Plateau population underwent significant differentiation from the European and American populations and evolved independently, and the global climate changes critically shaped the demographic history of the Tibetan Plateau population. Moreover, we identified key genes that are related to the cell wall and membrane system, and the development and defense systems regulated A. bisporus adapting to the harsh Tibetan Plateau environment. These findings highlight the value of genomic data in assessing the evolution and adaptation of mushrooms and will enhance future genetic improvements of A. bisporus.

April 21, 2020

Multi-platform discovery of haplotype-resolved structural variation in human genomes.

The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.
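The size split used in this abstract (indels below 50 bp, SVs at 50 bp and above) can be sketched as a small classifier. This is only an illustration: the function name and the allele-string input are assumptions, and it covers insertions and deletions only. Balanced events such as the inversions mentioned above do not change allele length and need other evidence.

```python
SV_MIN_LENGTH_BP = 50  # threshold taken from the abstract: indels < 50 bp, SVs >= 50 bp


def classify_variant(ref: str, alt: str) -> str:
    """Label an insertion/deletion by the length difference between its alleles.

    Hypothetical helper for illustration; not the callers used in the study.
    """
    size = abs(len(alt) - len(ref))
    if size == 0:
        # Same-length alleles (e.g. SNVs) fall outside the two classes in the abstract.
        return "substitution"
    return "indel" if size < SV_MIN_LENGTH_BP else "SV"


if __name__ == "__main__":
    print(classify_variant("A", "A" + "T" * 49))  # 49 bp insertion
    print(classify_variant("A" + "C" * 50, "A"))  # 50 bp deletion
```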

April 21, 2020

Occurrence and Characterization of mcr-1-Positive Escherichia coli Isolated From Food-Producing Animals in Poland, 2011-2016.

The emergence of plasmid-mediated colistin resistance (mcr genes) threatens the effectiveness of polymyxins, which are last-resort drugs to treat infections by multidrug- and carbapenem-resistant Gram-negative bacteria. Based on the occurrence of colistin resistance the aims of the study were to determine possible resistance mechanisms and then characterize the mcr-positive Escherichia coli. The research used material from the Polish national and EU harmonized antimicrobial resistance (AMR) monitoring programs. A total of 5,878 commensal E. coli from fecal samples of turkeys, chickens, pigs, and cattle collected in 2011-2016 were screened by minimum inhibitory concentration (MIC) determination for the presence of resistance to colistin (R) defined as R > 2 mg/L. Strains with MIC = 2 mg/L isolated in 2014-2016 were also included. A total of 128 isolates were obtained, and most (66.3%) had colistin MIC of 2 mg/L. PCR revealed mcr-1 in 80 (62.5%) isolates recovered from 61 turkeys, 11 broilers, 2 laying hens, 1 pig, and 1 bovine. No other mcr-type genes (including mcr-2 to -5) were detected. Whole-genome sequencing (WGS) of the mcr-1-positive isolates showed high diversity in the multi-locus sequence types (MLST) of E. coli, plasmid replicons, and AMR and virulence genes. Generally mcr-1.1 was detected on the same contig as the IncX4 (76.3%) and IncHI2 (6.3%) replicons. One isolate harbored mcr-1.1 on the chromosome. Various extended-spectrum beta-lactamase (blaSHV-12, blaCTX-M-1, blaCTX-M-15, blaTEM-30, blaTEM-52, and blaTEM-135) and quinolone resistance genes (qnrS1, qnrB19, and chromosomal gyrA, parC, and parE mutations) were present in the mcr-1.1-positive E. coli. A total of 49 sequence types (ST) were identified, ST354, ST359, ST48, and ST617 predominating. One isolate, identified as ST189, belonged to atypical enteropathogenic E. coli. 
Our findings show that mcr-1.1 has spread widely among production animals in Poland, particularly in turkeys, and appears to be transferable mainly by IncX4 and IncHI2 plasmids spread across diverse E. coli lineages. Interestingly, most of these mcr-1-positive E. coli would remain undetected using phenotypic methods with the current epidemiological cut-off value (ECOFF). The appearance and spread of mcr-1 among various animals, notably in turkeys, might be considered a food chain and public health hazard.
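The point about phenotypic screening can be made concrete with a minimal sketch. It assumes the cut-off implied by the abstract (resistance defined as MIC > 2 mg/L); the `Isolate` record, the function names, and the example isolates are hypothetical and are not data from the study.

```python
from dataclasses import dataclass

COLISTIN_CUTOFF_MG_L = 2.0  # from the abstract: resistance defined as MIC > 2 mg/L


@dataclass
class Isolate:
    name: str
    mic_mg_l: float          # colistin minimum inhibitory concentration
    mcr1_pcr_positive: bool  # result of mcr-1 PCR screening


def phenotypically_resistant(isolate: Isolate, cutoff: float = COLISTIN_CUTOFF_MG_L) -> bool:
    """True when the MIC lies strictly above the cut-off."""
    return isolate.mic_mg_l > cutoff


def missed_by_phenotype(isolates, cutoff: float = COLISTIN_CUTOFF_MG_L):
    """Names of mcr-1-positive isolates that MIC screening alone would not flag."""
    return [
        i.name
        for i in isolates
        if i.mcr1_pcr_positive and not phenotypically_resistant(i, cutoff)
    ]


if __name__ == "__main__":
    # Made-up isolates for illustration only.
    example = [
        Isolate("iso-A", 2.0, True),   # carries mcr-1 but sits at the cut-off
        Isolate("iso-B", 4.0, True),   # carries mcr-1 and is above the cut-off
        Isolate("iso-C", 1.0, False),  # wild type
    ]
    print(missed_by_phenotype(example))
```

An isolate with an MIC of exactly 2 mg/L is not flagged by the strict `>` comparison, which is why the study also screened strains at that value.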

April 21, 2020

Efomycins K and L From a Termite-Associated Streptomyces sp. M56 and Their Putative Biosynthetic Origin.

Two new elaiophylin derivatives, efomycins K (1) and L (2), and five known elaiophylin derivatives (3-7) were isolated from the termite-associated Streptomyces sp. M56. The structures were determined by 1D and 2D NMR and HR-ESIMS analyses and comparative CD spectroscopy. The putative gene cluster responsible for the production of the elaiophylin and efomycin derivatives was identified based on significant homology to related clusters. Phylogenetic analysis of gene cluster domains was used to provide a biosynthetic rationale for these new derivatives and to demonstrate how a single biosynthetic pathway can produce diverse structures.

April 21, 2020

Comparative Genomics of Marine Sponge-Derived Streptomyces spp. Isolates SM17 and SM18 With Their Closest Terrestrial Relatives Provides Novel Insights Into Environmental Niche Adaptations and Secondary Metabolite Biosynthesis Potential.

The emergence of antibiotic resistant microorganisms has led to an increased need for the discovery and development of novel antimicrobial compounds. Frequent rediscovery of the same natural products (NPs) continues to decrease the likelihood of the discovery of new compounds from soil bacteria. Thus, efforts have shifted toward investigating microorganisms and their secondary metabolite biosynthesis potential, from diverse niche environments, such as those isolated from marine sponges. Here we investigated at the genomic level two Streptomyces spp. strains, namely SM17 and SM18, isolated from the marine sponge Haliclona simulans, with previously reported antimicrobial activity against clinically relevant pathogens; using single molecule real-time (SMRT) sequencing. We performed a series of comparative genomic analyses on SM17 and SM18 with their closest terrestrial relatives, namely S. albus J1074 and S. pratensis ATCC 33331 respectively; in an effort to provide further insights into potential environmental niche adaptations (ENAs) of marine sponge-associated Streptomyces, and on how these adaptations might be linked to their secondary metabolite biosynthesis potential. Prediction of secondary metabolite biosynthetic gene clusters (smBGCs) indicated that, even though the marine isolates are closely related to their terrestrial counterparts at a genomic level; they potentially produce different compounds. SM17 and SM18 displayed a better ability to grow in high salinity medium when compared to their terrestrial counterparts, and further analysis of their genomes indicated that they possess a pool of 29 potential ENA genes that are absent in S. albus J1074 and S. pratensis ATCC 33331. 
This ENA gene pool included functional categories of genes that are likely to be related to niche adaptations and which could be grouped based on potential biological functions such as osmotic stress, defense; transcriptional regulation; symbiotic interactions; antimicrobial compound production and resistance; ABC transporters; together with horizontal gene transfer and defense-related features.

April 21, 2020

Mobilome of Brevibacterium aurantiacum Sheds Light on Its Genetic Diversity and Its Adaptation to Smear-Ripened Cheeses.

Brevibacterium aurantiacum is an actinobacterium that confers key organoleptic properties to washed-rind cheeses during the ripening process. Although this industrially relevant species has been gaining increasing attention in the past years, its genome plasticity is still understudied due to the unavailability of complete genomic sequences. To add insights on the mobilome of this group, we sequenced the complete genomes of five dairy Brevibacterium strains and one non-dairy strain using PacBio RSII. We performed phylogenetic and pan-genome analyses, including comparisons with other publicly available Brevibacterium genomic sequences. Our phylogenetic analysis revealed that these five dairy strains, previously identified as Brevibacterium linens, belong instead to the B. aurantiacum species. A high number of transposases and integrases were observed in the Brevibacterium spp. strains. In addition, we identified 14 and 12 new insertion sequences (IS) in B. aurantiacum and B. linens genomes, respectively. Several stretches of homologous DNA sequences were also found between B. aurantiacum and other cheese rind actinobacteria, suggesting horizontal gene transfer (HGT). A HGT region from an iRon Uptake/Siderophore Transport Island (RUSTI) and an iron uptake composite transposon were found in five B. aurantiacum genomes. These findings suggest that low iron availability in milk is a driving force in the adaptation of this bacterial species to this niche. Moreover, the exchange of iron uptake systems suggests cooperative evolution between cheese rind actinobacteria. We also demonstrated that the integrative and conjugative element BreLI (Brevibacterium Lanthipeptide Island) can excise from the B. aurantiacum SMQ-1417 chromosome. Our comparative genomic analysis suggests that mobile genetic elements played an important role in the adaptation of B. aurantiacum to cheese ecosystems.

April 21, 2020

A Newly Isolated Bacillus subtilis Strain Named WS-1 Inhibited Diarrhea and Death Caused by Pathogenic Escherichia coli in Newborn Piglets.

Bacillus subtilis is recognized as a safe and reliable human and animal probiotic and is associated with bioactivities such as production of vitamin and immune stimulation. Additionally, it has great potential to be used as an alternative to antimicrobial drugs, which is significant in the context of antibiotic abuse in food animal production. In this study, we isolated one strain of B. subtilis, named WS-1, from apparently healthy pigs growing with sick cohorts on one Escherichia coli endemic commercial pig farm in Guangdong, China. WS-1 can strongly inhibit the growth of pathogenic E. coli in vitro. The B. subtilis strain WS-1 showed typical Bacillus characteristics by endospore staining, biochemical test, enzyme activity analysis, and 16S rRNA sequence analysis. Genomic analysis showed that the B. subtilis strain WS-1 shares 100% genomic synteny with B. subtilis with a size of 4,088,167 bp. Importantly, inoculation of newborn piglets with 1.5 × 10^10 CFU of B. subtilis strain WS-1 by oral feeding was able to clearly inhibit diarrhea (p < 0.05) and death (p < 0.05) caused by pathogenic E. coli in piglets. Furthermore, histopathological results showed that the WS-1 strain could protect small intestine from lesions caused by E. coli infection. Collectively, these findings suggest that the probiotic B. subtilis strain WS-1 acts as a potential biocontrol agent protecting pigs from pathogenic E. coli infection. Importance: In this work, one B. subtilis strain (WS-1) was successfully isolated from apparently healthy pigs growing with sick cohorts on one E. coli endemic commercial pig farm in Guangdong, China. The B. subtilis strain WS-1 was identified to inhibit the growth of pathogenic E. coli both in vitro and in vivo, indicating its potential application in protecting newborn piglets from diarrhea caused by E. coli infections. 
The isolation and characterization will help better understand this bacterium, and the strain WS-1 can be further explored as an alternative to antimicrobial drugs to protect human and animal health.

April 21, 2020

A reference-grade wild soybean genome.

Efficient crop improvement depends on the application of accurate genetic information contained in diverse germplasm resources. Here we report a reference-grade genome of wild soybean accession W05, with a final assembled genome size of 1013.2 Mb and a contig N50 of 3.3 Mb. The analytical power of the W05 genome is demonstrated by several examples. First, we identify an inversion at the locus determining seed coat color during domestication. Second, a translocation event between chromosomes 11 and 13 of some genotypes is shown to interfere with the assignment of QTLs. Third, we find a region containing copy number variations of the Kunitz trypsin inhibitor (KTI) genes. Such findings illustrate the power of this assembly in the analysis of large structural variations in soybean germplasm collections. The wild soybean genome assembly has wide applications in comparative genomic and evolutionary studies, as well as in crop breeding and improvement programs.

April 21, 2020

Deep convolutional neural networks for accurate somatic mutation detection.

Accurate detection of somatic mutations is still a challenge in cancer analysis. Here we present NeuSomatic, the first convolutional neural network approach for somatic mutation detection, which significantly outperforms previous methods on different sequencing platforms, sequencing strategies, and tumor purities. NeuSomatic summarizes sequence alignments into small matrices and incorporates more than a hundred features to capture mutation signals effectively. It can be used universally as a stand-alone somatic mutation detection method or with an ensemble of existing methods to achieve the highest accuracy.

April 21, 2020

Circular consensus sequencing with long reads.

Long-read sequencing technologies have advantages in genome assembly, structural variant detection and haplotype phasing, but are less suited for single-nucleotide variant (SNV) and insertion/deletion (indel) calling due to the high error rate in comparison with short-read sequencing. Wenger et al., from Pacific Biosciences, optimized the circular consensus sequencing (CCS) protocol to achieve long, high-fidelity reads, in which they selected the SMRTbell library with fractions tightly distributed at 15 kb for high-coverage sequencing.

April 21, 2020

Sequence properties of certain GC rich avian genes, their origins and absence from genome assemblies: case studies.

More and more eukaryotic genomes are sequenced and assembled, most of them presented as a complete model in which missing chromosomal regions are filled by Ns and where a few chromosomes may be lacking. Avian genomes often contain sequences with high GC content, which has been hypothesized to be at the origin of many missing sequences in these genomes. We investigated features of these missing sequences to discover why some may not have been integrated into genomic libraries and/or sequenced. The sequences of five red jungle fowl cDNA models with high GC content were used as queries to search publicly available datasets of Illumina and PacBio sequencing reads. These were used to reconstruct the leptin, TNFa, MRPL52, PCP2 and PET100 genes, all of which are absent from the red jungle fowl genome model. These gene sequences displayed elevated GC contents, had intron sizes that were sometimes larger than non-avian orthologues, and had non-coding regions that contained numerous tandem and inverted repeat sequences with motifs able to assemble into stable G-quadruplexes and intrastrand dyadic structures. Our results suggest that Illumina technology was unable to sequence the non-coding regions of these genes. On the other hand, PacBio technology was able to sequence these regions, but with dramatically lower efficiency than would typically be expected. High GC content was not the principal reason why numerous GC-rich regions of avian genomes are missing from genome assembly models. Instead, it is the presence of tandem repeats containing motifs capable of assembling into very stable secondary structures that is likely responsible.
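The two sequence properties contrasted in this abstract, overall GC content and G-quadruplex-forming motifs, can be told apart with a short sketch. The regular expression below is a commonly used heuristic for putative G-quadruplexes (four runs of at least three G separated by loops of 1 to 7 bases); it is an assumption chosen for illustration, not the method used by the authors, and it scans the given strand only.

```python
import re

# Heuristic pattern for putative G-quadruplexes: four G-runs (>= 3 G each)
# separated by loops of 1-7 bases. Illustrative assumption, not the paper's method.
G4_PATTERN = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_g4_motifs(seq: str):
    """Return (start, end) spans of non-overlapping putative G4 motifs on this strand."""
    return [(m.start(), m.end()) for m in G4_PATTERN.finditer(seq.upper())]


if __name__ == "__main__":
    with_motif = "ATGGGAGGGTGGGCGGGAT"
    without_motif = "GCGCGCGCGCGCGCGCGCG"  # more GC-rich, but no G-runs
    for s in (with_motif, without_motif):
        print(round(gc_content(s), 2), find_g4_motifs(s))
```

The second example sequence has the higher GC content yet no motif hit, which mirrors the abstract's distinction between GC richness and structure-forming repeats.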

April 21, 2020

The First Highly Contiguous Genome Assembly of Pikeperch (Sander lucioperca), an Emerging Aquaculture Species in Europe

The pikeperch (Sander lucioperca) is a fresh and brackish water Percid fish natively inhabiting the northern hemisphere. This species is emerging as a promising candidate for intensive aquaculture production in Europe. Specific traits like cannibalism, growth rate and meat quality require genomics-based understanding, for an optimal husbandry and domestication process. Still, the aquaculture community is lacking an annotated genome sequence to facilitate genome-wide studies on pikeperch. Here, we report the first highly contiguous draft genome assembly of Sander lucioperca. In total, 413 and 66 giga base pairs of DNA sequencing raw data were generated with the Illumina platform and PacBio Sequel System, respectively. The PacBio data were assembled into a final assembly size of ~900 Mb covering 89% of the 1,014 Mb estimated genome size. The draft genome consisted of 1,966 contigs ordered into 1,313 scaffolds. The contig and scaffold N50 lengths are 3.0 Mb and 4.9 Mb, respectively. The identified repetitive structures accounted for 39% of the genome. We utilized homologies to other ray-finned fishes, and ab initio gene prediction methods to predict 21,249 protein-coding genes in the Sander lucioperca genome, of which 88% were functionally annotated by either sequence homology or protein domains and signatures search. The assembled genome spans 97.6% and 96.3% of Vertebrate and Actinopterygii single-copy orthologs, respectively. The outstanding mapping rate (99.9%) of genomic PE-reads on the assembly suggests an accurate and nearly complete genome reconstruction. This draft genome sequence is the first genomic resource for this promising aquaculture species. It will provide an impetus for genomic-based breeding studies targeting phenotypic and performance traits of captive pikeperch.
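The contiguity figures quoted in this abstract (contig and scaffold N50, assembly size as a share of the estimated genome size) follow standard definitions that can be sketched briefly. The function names and the toy length list are assumptions for illustration; only the ~900 Mb and 1,014 Mb figures come from the abstract.

```python
def n50(lengths):
    """Length L such that sequences of length >= L cover at least half the total."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


def percent_of_genome(assembly_size_mb: float, estimated_genome_mb: float) -> float:
    """Assembly size as a percentage of the estimated genome size."""
    return 100.0 * assembly_size_mb / estimated_genome_mb


if __name__ == "__main__":
    toy_contigs = [2, 3, 4, 5, 6]  # made-up lengths, not pikeperch data
    print(n50(toy_contigs))
    # Figures from the abstract: ~900 Mb assembled, 1,014 Mb estimated.
    print(round(percent_of_genome(900, 1014)))
```

Sorting from longest to shortest and stopping once half of the total length is reached is the usual way to compute N50; a few long contigs can therefore raise the value even when many short contigs remain.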

April 21, 2020

The Impact of cDNA Normalization on Long-Read Sequencing of a Complex Transcriptome

Normalization of cDNA is widely used to improve the coverage of rare transcripts in analysis of transcriptomes employing next-generation sequencing. Recently, long-read technology has been emerging as a powerful tool for sequencing and construction of transcriptomes, especially for complex genomes containing highly similar transcripts and transcript-spliced isoforms. Here, we analyzed the transcriptome of sugarcane, a plant with a highly polyploid genome, by PacBio isoform sequencing (Iso-Seq) of two different cDNA library preparations, with and without a normalization step. The results demonstrated that, while the two libraries included many of the same transcripts, many longer transcripts were removed and many new, generally shorter transcripts were detected by normalization. For the same input cDNA and the same data yield, the normalized library recovered more total transcript isoforms, predicted gene families and orthologous groups, resulting in a higher representation for the sugarcane transcriptome, compared to the non-normalized library. The non-normalized library, on the other hand, included a wider transcript length range with more long transcripts above ~1.25 kb, more transcript isoforms per gene family and more gene ontology terms per transcript. A large proportion of the unique transcripts comprising ~52% of the normalized library were expressed at a lower level than the unique transcripts from the non-normalized library, across three tissue types tested including leaf, stalk and root. About 83% of the total 5,348 predicted long noncoding transcripts were derived from the normalized library, of which ~80% were derived from the lowly expressed fraction. Functional annotation of the unique transcripts suggested that each library enriched different functional transcript fractions. This demonstrated the complementation of the two approaches in obtaining a complete transcriptome of a complex genome at the sequencing depth used in this study.

April 21, 2020

Whole genome sequence and de novo assembly revealed genomic architecture of Indian Mithun (Bos frontalis).

Mithun (Bos frontalis), also called gayal, is an endangered bovine species, under the tribe bovini with 2n = 58 XX chromosome complements and reared under the tropical rain forests region of India, China, Myanmar, Bhutan and Bangladesh. However, the origin of this species is still disputed and information on its genomic architecture is scanty so far. We trust that availability of its whole genome sequence data and assembly will greatly solve this problem and help to generate many information including phylogenetic status of mithun. Recently, the first genome assembly of gayal, mithun of Chinese origin, was published. However, an improved reference genome assembly would still benefit in understanding genetic variation in mithun populations reared under diverse geographical locations and for building a superior consensus assembly. We, therefore, performed deep sequencing of the genome of an adult female mithun from India, assembled and annotated its genome and performed extensive bioinformatic analyses to produce a superior de novo genome assembly of mithun. We generated ~300 Gigabyte (Gb) raw reads from whole-genome deep sequencing platforms and assembled the sequence data using a hybrid assembly strategy to create a high quality de novo assembly of mithun with 96% recovered as per BUSCO analysis. The final genome assembly has a total length of 3.0 Gb, contains 5,015 scaffolds with an N50 value of 1 Mb. Repeat sequences constitute around 43.66% of the assembly. The genomic alignments between mithun to cattle showed that their genomes, as expected, are highly conserved. Gene annotation identified 28,044 protein-coding genes presented in mithun genome. 
The gene orthologous groups of mithun showed a high degree of similarity in comparison with other species, while fewer mithun specific coding sequences were found compared to those in cattle. Here we presented the first de novo draft genome assembly of Indian mithun having better coverage, less fragmented, better annotated, and constitutes a reasonably complete assembly compared to the previously published gayal genome. This comprehensive assembly unravelled the genomic architecture of mithun to a great extent and will provide a reference genome assembly to research community to elucidate the evolutionary history of mithun across its distinct geographical locations.

April 21, 2020

Chromosome-Level Alpaca Reference Genome VicPac3.1 Improves Genomic Insight Into the Biology of New World Camelids.

The development of high-quality chromosomally assigned reference genomes constitutes a key feature for understanding genome architecture of a species and is critical for the discovery of the genetic blueprints of traits of biological significance. South American camelids serve people in extreme environments and are important fiber and companion animals worldwide. Despite this, the alpaca reference genome lags far behind those available for other domestic species. Here we produced a chromosome-level improved reference assembly for the alpaca genome using the DNA of the same female Huacaya alpaca as in previous assemblies. We generated 190X Illumina short-read, 8X Pacific Biosciences long-read and 60X Dovetail Chicago® chromatin interaction scaffolding data for the assembly, used testis and skin RNAseq data for annotation, and cytogenetic map data for chromosomal assignments. The new assembly VicPac3.1 contains 90% of the alpaca genome in just 103 scaffolds and 76% of all scaffolds are mapped to the 36 pairs of the alpaca autosomes and the X chromosome. Preliminary annotation of the assembly predicted 22,462 coding genes and 29,337 isoforms. Comparative analysis of selected regions of the alpaca genome, such as the major histocompatibility complex (MHC), the region involved in the Minute Chromosome Syndrome (MCS) and candidate genes for high-altitude adaptations, reveal unique features of the alpaca genome. The alpaca reference genome VicPac3.1 presents a significant improvement in completeness, contiguity and accuracy over VicPac2 and is an important tool for the advancement of genomics research in all New World camelids.
