Genome assembly Archives

July 7, 2019

Complete genome sequence of Sulfitobacter sp. strain D7, a virulent bacterium isolated from an Emiliania huxleyi algal bloom in the North Atlantic.

A Rhodobacterales bacterium, Sulfitobacter sp. strain D7, was isolated from an Emiliania huxleyi bloom in the North Atlantic and has been shown to act as a pathogen and induce cell death of E. huxleyi during lab coculturing. We report here its complete genome sequence comprising one chromosome and five low-copy-number plasmids.

July 7, 2019

First complete genome sequence of Salmonella enterica subsp. enterica serovar Worthington strain CFSAN051295, isolated from pistachio.

We report here, using third-generation, single-molecule, real-time DNA sequencing, the first complete genome sequence of Salmonella enterica serovar Worthington CFSAN051295, isolated from pistachios in the United States. The genome consists of a single 4.9-Mb chromosome.

July 7, 2019

Pilot satellitome analysis of the model plant, Physcomitrella patens, revealed a transcribed and high-copy IGS-related tandem repeat.

Satellite DNA (satDNA) constitutes a substantial part of eukaryotic genomes. In the last decade, it has been shown that satDNA is not an inert part of the genome and its function extends beyond the nuclear membrane. However, the number of model plant species suitable for studying the novel horizons of satDNA functionality is low. Here, we explored the satellitome of the model “basal” plant, Physcomitrella patens (Hedwig, 1801) Bruch & Schimper, 1849 (moss), which has a number of advantages for deep functional and evolutionary research. Using a newly developed pyTanFinder pipeline (https://github.com/Kirovez/pyTanFinder) coupled with fluorescence in situ hybridization (FISH), we identified five high-copy-number tandem repeats (TRs) occupying a long DNA array in the moss genome. The nuclear organization study revealed that two TRs had distinct locations in the moss genome, concentrating in the heterochromatin and knob-rDNA-like chromatin bodies. Further genomic, epigenetic and transcriptomic analysis showed that one TR, named PpNATR76, was located in the intergenic spacer (IGS) region and transcribed into long non-coding RNAs (lncRNAs). Several specific features of PpNATR76 lncRNAs make them very similar to recently discovered human lncRNAs, raising a number of questions for future studies. This work provides new resources for functional studies of the satellitome in plants using the model organism P. patens, and describes a list of tandem repeats for further analysis.

July 7, 2019

BELLA: Berkeley Efficient Long-Read to Long-Read Aligner and Overlapper

De novo assembly is the process of reconstructing genomes from DNA fragments (reads), which may contain redundancy and errors. Longer reads simplify assembly and improve contiguity of the output, but current long-read technologies come with high error rates. A crucial step of de novo genome assembly for long reads consists of finding overlapping reads. We present Berkeley Long-Read to Long-Read Aligner and Overlapper (BELLA), which implements a novel approach to compute overlaps using Sparse Generalized Matrix Multiplication (SpGEMM). We present a probabilistic model which demonstrates the soundness of using short, fixed-length k-mers to detect overlaps, avoiding expensive pairwise alignment of all reads against all others. We then introduce a notion of reliable k-mers based on our probabilistic model. The use of reliable k-mers eliminates both the k-mer set explosion that would otherwise happen with highly erroneous reads and the spurious overlaps due to k-mers originating from repetitive regions. Finally, we present a new method to separate true alignments from false positives depending on the alignment score. Using this methodology, which is employed in BELLA's precise mode, the probability of false positives drops exponentially as the length of overlap between sequences increases. On simulated data, BELLA achieves an average of 2.26% higher recall than state-of-the-art tools in its sensitive mode and 18.90% higher precision than state-of-the-art tools in its precise mode, while remaining competitive in performance.
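The overlap-detection idea in this abstract can be illustrated with a toy sketch. It is not BELLA's implementation: the function names and the `min_count`/`max_count` thresholds are hypothetical placeholders, whereas BELLA derives its reliable k-mer bounds from its probabilistic model. The sketch builds a sparse reads-by-k-mers matrix A (stored by column), keeps only k-mers whose occurrence count falls inside an assumed "reliable" range, and accumulates the upper triangle of A·Aᵀ, whose entry (i, j) counts the reliable k-mers shared by reads i and j.

```python
from collections import defaultdict
from itertools import combinations


def kmers(seq, k):
    """Return the set of distinct k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_overlaps(reads, k, min_count=2, max_count=4, min_shared=1):
    """Count shared 'reliable' k-mers for every pair of reads.

    min_count and max_count are illustrative thresholds (assumptions),
    not values taken from the BELLA paper.
    """
    # Column view of the sparse matrix A: k-mer -> ids of reads containing it.
    columns = defaultdict(list)
    for read_id, seq in enumerate(reads):
        for kmer in kmers(seq, k):
            columns[kmer].append(read_id)

    # Outer-product form of A * A^T, restricted to reliable columns.
    # Read ids are appended in increasing order, so every pair has i < j.
    shared = defaultdict(int)
    for read_ids in columns.values():
        if min_count <= len(read_ids) <= max_count:
            for i, j in combinations(read_ids, 2):
                shared[(i, j)] += 1

    return {pair: n for pair, n in shared.items() if n >= min_shared}


reads = ["ACGTACGGA", "CGGATTTCA", "GGGGGGGGG"]
overlaps = candidate_overlaps(reads, k=4)
```

Only the pairs returned here would be passed on to pairwise alignment, which is how a k-mer prefilter avoids aligning all reads against all others.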

July 7, 2019

Long-read-based genome sequences of pandemic and environmental Vibrio cholerae strains.

The bacterium Vibrio cholerae exhibits two distinct lifestyles, one as an aquatic bacterium and the other as the etiological agent of the pandemic human disease cholera. Here, we report closed genome sequences of two seventh pandemic V. cholerae O1 El Tor strains, A1552 and N16961, and the environmental strain Sa5Y.

July 7, 2019

Complete genome sequence of the polymyxin E (colistin)-producing Paenibacillus sp. strain B-LR.

Paenibacillus bacteria are recovered from varied niches, including human lung, rhizosphere, marine sediments, and hemolymph. Paenibacilli can have plant growth-promoting activities and be antibiotic producers. They can produce exopolysaccharides and enzymes of industrial interest. Illumina and PacBio reads were used to produce a complete genome sequence of the colistin producer Paenibacillus sp. strain B-LR.

July 7, 2019

Improved assembly of reference genome Fusarium oxysporum f. sp. lycopersici strain Fol4287.

Fusarium oxysporum is a pathogenic fungus that infects hundreds of plant species. This paper reports the improved genome assembly of a reference strain, F. oxysporum f. sp. lycopersici Fol4287, a tomato pathogen.

July 7, 2019

Draft genome sequence of Olsenella sp. KGMB 04489 isolated from healthy Korean human feces

Members of the genus Olsenella have been isolated from the mouth, rumen, and feces of vertebrate animals. Olsenella sp. KGMB 04489 was isolated from fecal samples obtained from a healthy Korean. The whole-genome sequence of Olsenella sp. KGMB 04489 was analyzed using the PacBio Sequel platform. The genome comprises a 2,108,034 bp chromosome with a G + C content of 65.50%, 1,838 total genes, 13 rRNA genes, and 52 tRNA genes. Genome analysis also showed that strain KGMB 04489 carries genes for hydrolytic enzymes and for antibiotic biosynthesis and resistance.

July 7, 2019

Complete genome of the multidrug-resistant Escherichia coli strain KBN10P04869 isolated from a patient with acute myeloid leukemia

Recently, we isolated a multidrug-resistant Escherichia coli strain KBN10P04869 from a patient with acute myeloid leukemia. We report the complete genome of this strain, which consists of 5,104,264 bp with 4,457 protein-coding genes, 88 tRNAs, and 22 rRNAs, and the co-occurrence of multidrug resistance genes including blaCMY-2, blaTEM-1, blaCTX-M-15, blaNDM-5, and blaOXA-18.

July 7, 2019

Bridging gaps in transposable element research with single-molecule and single-cell technologies

More than half of the genomic landscape in humans and many other organisms is composed of repetitive DNA, which mostly derives from transposable elements (TEs) and viruses. Recent technological advances permit improved assessment of the repetitive content across genomes and newly developed molecular assays have revealed important roles of TEs and viruses in host genome evolution and organization. To update on our current understanding of TE biology and to promote new interdisciplinary strategies for the TE research community, leading experts gathered for the 2nd Uppsala Transposon Symposium on October 4–5, 2018 in Uppsala, Sweden. Using cutting-edge single-molecule and single-cell approaches, research on TEs and other repeats has entered a new era in biological and biomedical research.

July 7, 2019

Hardwood tree genomics: Unlocking woody plant biology.

Woody perennial angiosperms (i.e., hardwood trees) are polyphyletic in origin and occur in most angiosperm orders. Despite their independent origins, hardwoods have shared physiological, anatomical, and life history traits distinct from their herbaceous relatives. New high-throughput DNA sequencing platforms have provided access to numerous woody plant genomes beyond the early reference genomes of Populus and Eucalyptus, references that now include willow and oak, with pecan and chestnut soon to follow. Genomic studies within these diverse and undomesticated species have successfully linked genes to ecological, physiological, and developmental traits directly. Moreover, comparative genomic approaches are providing insights into speciation events while large-scale DNA resequencing of native collections is identifying population-level genetic diversity responsible for variation in key woody plant biology across and within species. Current research is focused on developing genomic prediction models for breeding, defining speciation and local adaptation, detecting and characterizing somatic mutations, revealing the mechanisms of gender determination and flowering, and applying systems biology approaches to model complex regulatory networks underlying quantitative traits. Emerging technologies such as single-molecule, long-read sequencing are being employed as additional woody plant species, and genotypes within species, are sequenced, thus enabling a comparative (“evo-devo”) approach to understanding the unique biology of large woody plants. Resource availability, current genomic and genetic applications, new discoveries and predicted future developments are illustrated and discussed for poplar, eucalyptus, willow, oak, chestnut, and pecan.

July 7, 2019

Complete Closed Genome Sequences of Three Salmonella enterica subsp. enterica Serovar Dublin Strains Isolated from Cattle at Harvest.

Salmonella enterica subsp. enterica serovar Dublin is a host-adapted pathogen for cattle that can cause invasive disease in humans. To facilitate genomic comparisons characterizing virulence determinants of this pathogen, we present the complete genome sequences of three S. Dublin strains isolated from bovine sources at harvest.

July 7, 2019

Complete Genome Sequence of the Industrial Fast-Acidifying Strain Streptococcus thermophilus N4L.

Streptococcus thermophilus is one of the most used dairy starters for the production of yogurt and cheese. We report here the complete genome sequence of the industrial strain S. thermophilus N4L, which is used in dairy technology for its fast-acidifying phenotype.

July 7, 2019

De novo genome assembly of the olive fruit fly (Bactrocera oleae) developed through a combination of linked-reads and long-read technologies

Long-read sequencing has greatly contributed to the generation of high-quality assemblies, albeit at a high cost. It is also not always clear how to combine sequencing platforms. We sequenced the genome of the olive fruit fly (Bactrocera oleae), the most important pest in the olive fruit agribusiness industry, using Illumina short reads, mate pairs, 10x Genomics linked reads, Pacific Biosciences (PacBio), and Oxford Nanopore Technologies (ONT). The 10x linked-reads assembly gave the most contiguous assembly with an N50 of 2.16 Mb. Scaffolding the linked-reads assembly using long reads from ONT gave a more contiguous assembly with a scaffold N50 of 4.59 Mb. We also present the most extensive transcriptome datasets of the olive fly derived from different tissues and stages of development. Finally, we used the Chromosome Quotient method to identify Y-chromosome scaffolds and show that the long-read-based assembly generates a highly contiguous Y-chromosome assembly.
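For readers unfamiliar with the N50 statistic quoted in this abstract, the following minimal sketch shows the standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name and the example lengths are illustrative and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50 requires at least one length")

    half_total = sum(ordered) / 2
    running = 0
    # Walk from longest to shortest until half the assembly is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (arbitrary units), for illustration only.
example_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])
```

A higher N50 means that fewer, longer sequences account for half of the assembly, which is why it serves as the contiguity measure when the linked-read and ONT-scaffolded assemblies are compared.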

July 7, 2019

Whole-Genome and Expression Analyses of Bamboo Aquaporin Genes Reveal Their Functions Involved in Maintaining Diurnal Water Balance in Bamboo Shoots.

Water supply is essential for maintaining normal physiological function during the rapid growth of bamboo. Aquaporins (AQPs) play crucial roles in water transport for plant growth and development. Although 26 PeAQPs in bamboo have been reported, the aquaporin-led mechanism of maintaining diurnal water balance in bamboo shoots remains unclear. In this study, a total of 63 PeAQPs were identified, based on the updated genome of moso bamboo (Phyllostachys edulis), including 22 PePIPs, 20 PeTIPs, 17 PeNIPs, and 4 PeSIPs. All of the PeAQPs were differently expressed in 26 different tissues of moso bamboo, based on RNA sequencing (RNA-seq) data. The root pressure in shoots showed circadian rhythm changes, with positive values at night and negative values in the daytime. The quantitative real-time PCR (qRT-PCR) result showed that 25 PeAQPs were detected in the base part of the shoots, and most of them demonstrated diurnal rhythm changes. The expression levels of some PeAQPs were significantly correlated with the root pressure. Of the 86 sugar transport genes, 33 had positive co-expression relationships with 27 PeAQPs. Two root pressure-correlated PeAQPs, PeTIP4;1 and PeTIP4;2, were confirmed to be highly expressed in the parenchyma and epidermal cells of bamboo culm, and in the epidermis, pith, and primary xylem of bamboo roots by in situ hybridization. The authors’ findings provide new insights and a possible aquaporin-led mechanism for bamboo fast growth.
