Biotechnology Archives - Page 170 of 183

July 7, 2019

Complete genome sequence of a new halophilic archaeon, Haloarcula taiwanensis, isolated from a solar saltern in southern Taiwan.

We report here the completion of the genome sequence of a new species of haloarchaea, Haloarcula taiwanensis, isolated in southern Taiwan. The 3,721,706-bp genome consisted of chromosome I (2,966,258 bp, 63.6% GC content), chromosome II (525,233 bp, 59.6% GC content), plasmid pNYT1 (129,893 bp, 55.3% GC content), and plasmid pNYT2 (100,322 bp, 55.7% GC content).
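
The replicon figures above can be cross-checked with a little arithmetic. The sketch below sums the four reported replicon sizes and derives a length-weighted GC content. The weighted value is not reported in the abstract: it is a derived estimate based on the rounded per-replicon percentages, and the function names are illustrative only.

```python
# Replicon sizes (bp) and GC fractions as reported in the abstract.
replicons = {
    "chromosome I": (2_966_258, 0.636),
    "chromosome II": (525_233, 0.596),
    "pNYT1": (129_893, 0.553),
    "pNYT2": (100_322, 0.557),
}


def total_length(reps):
    """Sum of all replicon lengths in bp."""
    return sum(length for length, _ in reps.values())


def weighted_gc(reps):
    """Length-weighted mean GC fraction across replicons (derived estimate)."""
    total = total_length(reps)
    return sum(length * gc for length, gc in reps.values()) / total


print(total_length(replicons))  # matches the reported 3,721,706 bp
print(round(100 * weighted_gc(replicons), 1))  # approximate overall GC, in percent
```

The replicon sizes sum exactly to the stated genome size, which is a quick consistency check worth running on any multi-replicon genome announcement.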

July 7, 2019

Complete genome sequence of Pseudomonas sp. strain NC02, isolated from soil.

We report here the complete genome sequence of Pseudomonas sp. strain NC02, isolated from soil in eastern Massachusetts. We assembled PacBio reads into a single closed contig with 132× mean coverage and then polished this contig using Illumina MiSeq reads, yielding a 6,890,566-bp sequence with 61.1% GC content. Copyright © 2018 Cerra et al.

July 7, 2019

Complete genome sequence of Dietzia sp. strain WMMA184, a marine coral-associated bacterium.

Dietzia sp. strain WMMA184 was isolated from the marine coral Montastraea faveolata as part of ongoing drug discovery efforts. Analysis of the 4.16-Mb genome provides information regarding interspecies interactions as it pertains to the regulation of secondary metabolism and natural product biosynthesis potential. Copyright © 2018 Braun et al.

July 7, 2019

Complete genome sequence of Thermoanaerobacterium sp. strain RBIITD, a butyrate- and butanol-producing thermophile.

Thermoanaerobacterium sp. strain RBIITD was isolated from contaminated rich growth medium at 55°C in an anaerobic chamber. It primarily produces butyrate as a fermentation product from plant biomass-derived sugars. The whole-genome sequence of the strain is 3.4 Mbp, with 3,444 genes and 32.48% GC content.

July 7, 2019

Complete genome sequence and methylome analysis of Bacillus caldolyticus NEB414.

Bacillus caldolyticus NEB414 is the original source strain for the restriction enzyme BclI. Its complete sequence and full methylome were determined using single-molecule real-time sequencing. Copyright © 2018 Fomenkov et al.

July 7, 2019

Complete genome sequence of industrial dairy strain Streptococcus thermophilus DGCC 7710.

We report here the complete genome sequence of Streptococcus thermophilus DGCC 7710. S. thermophilus is widely used in industrial dairy production.

July 7, 2019

Ten steps to get started in Genome Assembly and Annotation.

As a part of the ELIXIR-EXCELERATE efforts in capacity building, we present here 10 steps to facilitate researchers getting started in genome assembly and genome annotation. The guidelines given are broadly applicable, intended to be stable over time, and cover all aspects from start to finish of a general assembly and annotation project. Intrinsic properties of genomes are discussed, as is the importance of using high quality DNA. Different sequencing technologies and generally applicable workflows for genome assembly are also detailed. We cover structural and functional annotation and encourage readers to also annotate transposable elements, something that is often omitted from annotation workflows. The importance of data management is stressed, and we give advice on where to submit data and how to make your results Findable, Accessible, Interoperable, and Reusable (FAIR).

July 7, 2019

FMLRC: Hybrid long read error correction using an FM-index.

Long read sequencing is changing the landscape of genomic research, especially de novo assembly. Despite the high error rate inherent to long read technologies, increased read lengths dramatically improve the continuity and accuracy of genome assemblies. However, the cost and throughput of these technologies limit their application to complex genomes. One solution is to decrease the cost and time to assemble novel genomes by leveraging “hybrid” assemblies that use long reads for scaffolding and short reads for accuracy. We describe a novel method leveraging a multi-string Burrows-Wheeler Transform with auxiliary FM-index to correct errors in long read sequences using a set of complementary short reads. We demonstrate that our method efficiently produces significantly more high quality corrected sequence than existing hybrid error-correction methods. We also show that our method produces more contiguous assemblies, in many cases, than existing state-of-the-art hybrid and long-read only de novo assembly methods. Our method accurately corrects long read sequence data using complementary short reads. We demonstrate higher total throughput of corrected long reads and a corresponding increase in contiguity of the resulting de novo assemblies. Improved throughput and computational efficiency relative to existing methods will help make more economical use of emerging long read sequencing technologies.
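
To give a feel for the data structure named in the abstract, here is a minimal sketch of a Burrows-Wheeler Transform with FM-index backward search, used to count exact occurrences of a k-mer. It is a toy under stated assumptions: it indexes a single string with a naive rotation sort, whereas the abstract describes a multi-string BWT built from short reads. It is not FMLRC's implementation, and all function names are illustrative.

```python
from collections import Counter


def bwt(text):
    """Burrows-Wheeler transform via sorted rotations (toy inputs only)."""
    text += "$"  # sentinel, lexicographically smaller than A/C/G/T
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def build_fm_index(bwt_str):
    """Return (C, occ): C[c] counts symbols smaller than c; occ[c][i] counts c in bwt_str[:i]."""
    counts = Counter(bwt_str)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    occ = {ch: [0] * (len(bwt_str) + 1) for ch in counts}
    for i, ch in enumerate(bwt_str):
        for sym in occ:
            occ[sym][i + 1] = occ[sym][i] + (1 if sym == ch else 0)
    return c_table, occ


def count_occurrences(pattern, c_table, occ, n):
    """Backward search: number of exact occurrences of pattern in the indexed text."""
    lo, hi = 0, n  # half-open interval over the sorted suffixes
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


transformed = bwt("ACGTACGTTACG")
c_table, occ = build_fm_index(transformed)
print(count_occurrences("ACG", c_table, occ, len(transformed)))
```

In a hybrid corrector, k-mer counts of this kind from accurate short reads could indicate which stretches of a long read are well supported and which are likely errors. How FMLRC uses the counts is not detailed in the abstract.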

July 7, 2019

Identification and expression analysis of wheat TaGF14 genes.

The 14-3-3 gene family members play key roles in various cellular processes. However, little is known about the numbers and roles of 14-3-3 genes in wheat. The aims of this study were to identify the number of TaGF14 genes in wheat by searching its whole genome with BLAST, to study their phylogenetic relationships with other plant species, and to discuss the functions of TaGF14s. The results showed that common wheat harbors 20 TaGF14 genes, located on wheat chromosome groups 2, 3, 4, and 7. Of these, eighteen TaGF14s are non-ε proteins, and two, TaGF14i and TaGF14f, are ε proteins. Phylogenetic analysis indicated that these genes were divided into six clusters: cluster 1 (TaGF14d, TaGF14g, TaGF14j, TaGF14h, TaGF14c, and TaGF14n); cluster 2 (TaGF14k); cluster 3 (TaGF14b, TaGF14l, TaGF14m, and TaGF14s); cluster 4 (TaGF14a, TaGF14e, and TaGF14r); cluster 5 (TaGF14i and TaGF14f); and cluster 6 (TaGF14o, TaGF14p, TaGF14q, and TaGF14t). Tissue-specific gene expression suggested that all TaGF14s were likely constitutively expressed, except two genes, TaGF14p and TaGF14f. The highest levels of TaGF14 transcripts were observed in developing grains at 20 days post anthesis (DPA), especially for TaGF14j and TaGF14l. Five genes, i.e., TaGF14c, TaGF14d, TaGF14g, TaGF14h, and TaGF14j, were up-regulated under drought stress at both 1 and 6 h, suggesting that these genes play a vital role in combating drought stress. However, all TaGF14s were down-regulated under heat stress at both 1 and 6 h, indicating that TaGF14s may be negatively associated with heat stress, either by reducing expression to combat heat stress or through other pathways. These results suggested that cluster 1 genes, e.g., TaGF14j, may participate throughout wheat development, e.g., in grain filling (starch biosynthesis), and may also participate in combating drought stress.
Subsequently, a homolog of TaGF14j, TaGF14-JM22, was cloned by RACE and used to validate its function. Immunoblotting results showed that the TaGF14-JM22 protein, closely related to TaGF14d, TaGF14g, and TaGF14j, can interact with AGP-L, SSI, SSII, SBEIIa, and SBEIIb in developing grains, suggesting that TaGF14s located on group 4 may be involved in starch biosynthesis. Therefore, it is possible to develop starch-rich wheat cultivars by modifying TaGF14s.

July 7, 2019

Complete genome sequence of uropathogenic Escherichia coli isolate UPEC 26-1.

Urinary tract infections (UTIs) are among the most common infections in humans, predominantly caused by uropathogenic Escherichia coli (UPEC). The diverse genomes of UPEC strains mostly impede disease prevention and control measures. In this study, we comparatively analyzed the whole genome sequence of a highly virulent UPEC strain, namely UPEC 26-1, which was isolated from a urine sample of a patient suffering from UTI in Korea. Whole genome analysis showed that the genome consists of one circular chromosome of 5,329,753 bp, comprising 5,064 protein-coding genes, 122 RNA genes (94 tRNA, 22 rRNA and 6 ncRNA genes), and 100 pseudogenes, with an average G+C content of 50.56%. In addition, we identified 8 prophage regions (5 intact, 2 incomplete and 1 questionable) and 63 genomic islands (GIs), suggesting the possibility of horizontal gene transfer in this strain. Comparative genome analysis of UPEC 26-1 with the UPEC strain CFT073 revealed an average nucleotide identity of 99.7%. The genome comparison with CFT073 reveals major differences in the genome of UPEC 26-1 that would explain its increased virulence and biofilm formation. Nineteen of the total GIs were unique to UPEC 26-1 compared to CFT073, and nine of them harbored unique genes that are involved in virulence, multidrug resistance, biofilm formation and bacterial pathogenesis. The data from this study will assist in future studies of UPEC strains to develop effective control measures.

July 7, 2019

scanPAV: a pipeline for extracting presence-absence variations in genome pairs.

The recent technological advances in genome sequencing techniques have resulted in an exponential increase in the number of sequenced human and non-human genomes. The ever increasing number of assemblies generated by novel de novo pipelines and strategies demands the development of new software to evaluate assembly quality and completeness. One way to determine the completeness of an assembly is by detecting its presence-absence variations (PAVs) with respect to a reference, where PAVs between two assemblies are defined as the sequences present in one assembly but entirely missing in the other one. Beyond assembly error or technology bias, PAVs can also reveal real genome polymorphism, a consequence of species or individual evolution, or horizontal transfer from viruses and bacteria. We present scanPAV, a pipeline for pairwise assembly comparison to identify and extract sequences present in one assembly but not the other. In this note, we use the GRCh38 reference assembly to assess the completeness of six human genome assemblies from various assembly strategies and sequencing technologies including Illumina short reads, 10x Genomics linked-reads, PacBio and Oxford Nanopore long reads, and Bionano optical maps. We also discuss the PAV polymorphism of seven Tasmanian devil whole genome assemblies of normal animal tissues and devil facial tumour 1 (DFT1) and 2 (DFT2) samples, and the identification of bacterial sequences as contamination in some of the tumorous assemblies. The pipeline is available under the MIT License at https://github.com/wtsi-hpag/scanPAV. Supplementary data are available at Bioinformatics online.

July 7, 2019

The ‘gifted’ actinomycete Streptomyces leeuwenhoekii.

Streptomyces leeuwenhoekii strains C34T, C38, C58 and C79 were isolated from a soil sample collected from the Chaxa Lagoon, located in the Salar de Atacama in northern Chile. These streptomycetes produce a variety of new specialised metabolites with antibiotic, anti-cancer and anti-inflammatory activities. Moreover, genome mining performed on two of these strains has revealed the presence of biosynthetic gene clusters with the potential to produce new specialised metabolites. This review focusses on this new clade of Streptomyces strains, summarises the literature and presents new information on strain C34T.

July 7, 2019

Natural rubber and the Russian dandelion genome

The world needs rubber. Rubber is crucial for the tires on the cars, trucks and airplanes that propel modern transportation. It is equally important for daily tasks: latex gloves in the lab, balloons in angioplasty and wetsuits that warm a cold dip in the ocean. Rubber can be made synthetically from petroleum derivatives, but synthetic rubber is not as strong as rubber isolated from plants. The principal plant source for natural rubber (NR) is the sap of the Pará tree (Hevea brasiliensis), which is grown throughout Southeast Asia. Unfortunately, the production capacity of the Pará tree is limited by the availability of suitable land and by labor-intensive harvesting methods. The sustainability of the Pará crop is also constrained by its narrow genetic base, which may make the crop susceptible to disease.

July 7, 2019

Smooth q-Gram, and its applications to detection of overlaps among long, error-prone sequencing reads

We propose smooth q-gram, the first variant of q-gram that captures q-gram pairs within a small edit distance. We apply smooth q-gram to the problem of detecting overlapping pairs of error-prone reads produced by single molecule real time sequencing (SMRT), which is the first and most critical step of the de novo fragment assembly of SMRT reads. We have implemented and tested our algorithm on a set of real world benchmarks. Our empirical results demonstrated the significant superiority of our algorithm over the existing q-gram based algorithms in accuracy.
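
For context, the sketch below shows the plain q-gram baseline that the abstract compares against: reads are flagged as candidate overlaps when they share enough exact q-grams. It does not implement the smooth q-gram itself, whose construction is not described in the abstract. The function names, the toy reads, and the `min_shared` threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def qgrams(seq, q):
    """Set of all length-q substrings of seq."""
    return {seq[i:i + q] for i in range(len(seq) - q + 1)}


def candidate_overlaps(reads, q, min_shared):
    """Map each read pair to its shared exact q-gram count, keeping pairs with >= min_shared."""
    index = defaultdict(set)  # q-gram -> names of reads containing it
    for name, seq in reads.items():
        for gram in qgrams(seq, q):
            index[gram].add(name)
    shared = defaultdict(int)
    for names in index.values():
        for a, b in combinations(sorted(names), 2):
            shared[(a, b)] += 1
    return {pair: n for pair, n in shared.items() if n >= min_shared}


# Hypothetical toy reads: r1 and r2 overlap by "ACGGTCA"; r3 is unrelated.
reads = {"r1": "ACGTACGGTCA", "r2": "ACGGTCATTGC", "r3": "TTTTTTTTTTT"}
print(candidate_overlaps(reads, q=4, min_shared=3))
```

Because a single substitution or indel destroys every exact q-gram that spans it, high error rates quickly erode the shared count. That limitation is the motivation for matching q-grams within a small edit distance.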

July 7, 2019

Gapless genome assembly of the potato and tomato early blight pathogen Alternaria solani.

The Alternaria genus consists of saprophytic fungi as well as plant-pathogenic species that have significant economic impact. To date, the genomes of multiple Alternaria species have been sequenced. These studies have yielded valuable data for molecular studies on Alternaria fungi. However, most of the current Alternaria genome assemblies are highly fragmented, thereby hampering the identification of genes that are involved in causing disease. Here, we report a gapless genome assembly of A. solani, the causal agent of early blight in tomato and potato. The genome assembly is a significant step toward a better understanding of pathogenicity of A. solani.
