Reference genome Archives

September 22, 2019

Endogenous rRNA sequence variation can regulate stress response gene expression and phenotype.

Prevailing dogma holds that ribosomes are uniform in composition and function. Here, we show that nutrient limitation-induced stress in E. coli changes the relative expression of rDNA operons to alter the rRNA composition within the actively translating ribosome pool. The most upregulated operon encodes the unique 16S rRNA, rrsH, distinguished by conserved sequence variation within the small ribosomal subunit. rrsH-bearing ribosomes affect the expression of functionally coherent gene sets and alter the levels of the RpoS sigma factor, the master regulator of the general stress response. These impacts are associated with phenotypic changes in antibiotic sensitivity, biofilm formation, and cell motility and are regulated by stress response proteins, RelA and RelE, as well as the metabolic enzyme and virulence-associated protein, AdhE. These findings establish that endogenously encoded, naturally occurring rRNA sequence variation can modulate ribosome function, central aspects of gene expression regulation, and cellular physiology. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

September 22, 2019

How long are long tandem repeats? A challenge for current methods of whole-genome sequence assembly: The case of satellites in Caenorhabditis elegans.

Repetitive genome regions have been difficult to sequence, mainly because of the comparatively small size of the fragments used in assembly. Satellites or tandem repeats are very abundant in nematodes and offer an excellent playground to evaluate different assembly methods. Here, we compare the structure of satellites found in three different assemblies of the Caenorhabditis elegans genome: the original sequence obtained by Sanger sequencing, an assembly based on PacBio technology, and an assembly using Nanopore sequencing reads. In general, satellites were found in equivalent genomic regions, but the new long-read methods (PacBio and Nanopore) tended to result in longer assembled satellites. Important differences exist between the assemblies resulting from the two long-read technologies, such as the sizes of long satellites. Our results also suggest that the lengths of some annotated genes with internal repeats which were assembled using Sanger sequencing are likely to be incorrect.

September 22, 2019

Repeat elements organise 3D genome structure and mediate transcription in the filamentous fungus Epichloë festucae.

Structural features of genomes, including the three-dimensional arrangement of DNA in the nucleus, are increasingly seen as key contributors to the regulation of gene expression. However, studies on how genome structure and nuclear organisation influence transcription have so far been limited to a handful of model species. This narrow focus limits our ability to draw general conclusions about the ways in which three-dimensional structures are encoded, and to integrate information from three-dimensional data to address a broader gamut of biological questions. Here, we generate a complete and gapless genome sequence for the filamentous fungus, Epichloë festucae. We use Hi-C data to examine the three-dimensional organisation of the genome, and RNA-seq data to investigate how Epichloë genome structure contributes to the suite of transcriptional changes needed to maintain symbiotic relationships with the grass host. Our results reveal a genome in which very repeat-rich blocks of DNA with discrete boundaries are interspersed by gene-rich sequences that are almost repeat-free. In contrast to other species reported to date, the three-dimensional structure of the genome is anchored by these repeat blocks, which act to isolate transcription in neighbouring gene-rich regions. Genes that are differentially expressed in planta are enriched near the boundaries of these repeat-rich blocks, suggesting that their three-dimensional orientation partly encodes and regulates the symbiotic relationship formed by this organism.

September 22, 2019

Loss of bacitracin resistance due to a large genomic deletion among Bacillus anthracis strains.

Bacillus anthracis is a Gram-positive endospore-forming bacterial species that causes anthrax in both humans and animals. In Zambia, anthrax cases are frequently reported in both livestock and wildlife, with occasional transmission to humans, causing serious public health problems in the country. To understand the genetic diversity of B. anthracis strains in Zambia, we sequenced and compared the genomic DNA of B. anthracis strains isolated across the country. Single nucleotide polymorphisms clustered these strains into three groups. Genome sequence comparisons revealed a large deletion in strains belonging to one of the groups, possibly due to unequal crossing over between a pair of rRNA operons. The deleted genomic region included genes conferring resistance to bacitracin, and the strains with the deletion were confirmed to have lost bacitracin resistance. Similar deletions between rRNA operons were also observed in a few B. anthracis strains phylogenetically distant from Zambian strains. The structure of bacitracin resistance genes flanked by rRNA operons was conserved only in members of the Bacillus cereus group. The diversity and genomic characteristics of B. anthracis strains determined in this study would help in the development of genetic markers and treatment of anthrax in Zambia.

IMPORTANCE Anthrax is caused by Bacillus anthracis, an endospore-forming soil bacterium. The genetic diversity of B. anthracis is known to be low compared with that of Bacillus species. In this study, we performed whole-genome sequencing of Zambian isolates of B. anthracis to understand the genetic diversity between closely related strains. Comparison of genomic sequences revealed that closely related strains were separated into three groups based on single nucleotide polymorphisms distributed throughout the genome. A large genomic deletion was detected in the region containing a bacitracin resistance gene cluster flanked by rRNA operons, resulting in the loss of bacitracin resistance. The structure of the deleted region, which was also conserved among species of the Bacillus cereus group, has the potential for both deletion and amplification and thus might be enabling the species to flexibly control the level of bacitracin resistance for adaptive evolution.

September 22, 2019

A complete Cannabis chromosome assembly and adaptive admixture for elevated cannabidiol (CBD) content

Cannabis has been cultivated for millennia with distinct cultivars providing either fiber and grain or tetrahydrocannabinol. Recent demand for cannabidiol rather than tetrahydrocannabinol has favored the breeding of admixed cultivars with extremely high cannabidiol content. Despite several draft Cannabis genomes, the genomic structure of cannabinoid synthase loci has remained elusive. A genetic map derived from a tetrahydrocannabinol/cannabidiol segregating population and a complete chromosome assembly from a high-cannabidiol cultivar together resolve the linkage of cannabidiolic and tetrahydrocannabinolic acid synthase gene clusters which are associated with transposable elements. High-cannabidiol cultivars appear to have been generated by integrating hemp-type cannabidiolic acid synthase gene clusters into a background of marijuana-type cannabis. Quantitative trait locus mapping suggests that overall drug potency, however, is associated with other genomic regions needing additional study.

September 22, 2019

SKA: Split Kmer Analysis Toolkit for Bacterial Genomic Epidemiology

Genome sequencing is revolutionising infectious disease epidemiology, providing a huge step forward in sensitivity and specificity over more traditional molecular typing techniques. However, the complexity of genome data often means that its analysis and interpretation requires high-performance compute infrastructure and dedicated bioinformatics support. Furthermore, current methods have limitations that can differ between analyses and are often opaque to the user, and their reliance on multiple external dependencies makes reproducibility difficult. Here I introduce SKA, a toolkit for analysis of genome sequence data from closely-related, small, haploid genomes. SKA uses split kmers to rapidly identify variation between genome sequences, making it possible to analyse hundreds of genomes on a standard home computer. Tests on publicly available simulated and real-life data show that SKA is both faster and more efficient than the gold standard methods used today while retaining similar levels of accuracy for epidemiological purposes. SKA can take raw read data or genome assemblies as input and calculate pairwise distances, create single linkage clusters and align genomes to a reference genome or using a reference-free approach. SKA requires few decisions to be made by the user, which, along with its computational efficiency, allows genome analysis to become accessible to those with only basic bioinformatics training. The limitations of SKA are also far more transparent than for current approaches, and future improvements to mitigate these limitations are possible. Overall, SKA is a powerful addition to the armoury of the genomic epidemiologist. SKA source code is available from Github (https://github.com/simonrharris/SKA).
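To make the split-kmer idea concrete, here is a minimal Python sketch. It is not SKA's implementation. It assumes that a split kmer is a pair of flanking arms of length k on either side of a single middle base, and that two genomes differ at a site when they share the same arms but carry different middle bases. The function names `split_kmers` and `snp_distance` are hypothetical, and the sketch ignores reverse complements, ambiguous bases, and read-depth filtering.

```python
def split_kmers(seq, k=3):
    """Map each (left arm, right arm) pair to the base found between the arms.

    Arm pairs seen with more than one middle base in the same sequence are
    dropped, since they cannot be assigned a single allele.
    """
    table = {}
    ambiguous = set()
    span = 2 * k + 1  # left arm + middle base + right arm
    for i in range(len(seq) - span + 1):
        left = seq[i:i + k]
        middle = seq[i + k]
        right = seq[i + k + 1:i + span]
        key = (left, right)
        if key in table and table[key] != middle:
            ambiguous.add(key)
        table[key] = middle
    for key in ambiguous:
        del table[key]
    return table


def snp_distance(seq_a, seq_b, k=3):
    """Count arm pairs shared by both sequences whose middle bases differ."""
    kmers_a = split_kmers(seq_a, k)
    kmers_b = split_kmers(seq_b, k)
    shared = kmers_a.keys() & kmers_b.keys()
    return sum(1 for key in shared if kmers_a[key] != kmers_b[key])


if __name__ == "__main__":
    # Two toy sequences that differ at a single position (T versus C).
    print(snp_distance("AACGTTGCATG", "AACGTCGCATG", k=3))
```

Because only exact dictionary lookups on the arms are needed, a comparison of this kind needs no alignment step, which is consistent with the abstract's claim that hundreds of small genomes can be analysed on modest hardware.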

September 22, 2019

Physiological genomics of dietary adaptation in a marine herbivorous fish

Adopting a new diet is a significant evolutionary change and can profoundly affect an animal's physiology, biochemistry, ecology, and its genome. To study this evolutionary transition, we investigated the physiology and genomics of digestion of a derived herbivorous fish, the monkeyface prickleback (Cebidichthys violaceus). We sequenced and assembled its genome and digestive transcriptome and revealed the molecular changes related to important dietary enzymes, finding abundant evidence for adaptation at the molecular level. In this species, two gene families experienced expansion in copy number and adaptive amino acid substitutions. These families, amylase and bile salt-activated lipase, are involved in the digestion of carbohydrates and lipids, respectively. Both show elevated levels of gene expression and increased enzyme activity. Because carbohydrates are abundant in the prickleback's diet and lipids are rare, these findings suggest that such dietary specialization involves both exploiting abundant resources and scavenging rare ones, especially essential nutrients, like essential fatty acids.

September 22, 2019

Constant conflict between Gypsy LTR retrotransposons and CHH methylation within a stress-adapted mangrove genome.

The evolutionary dynamics of the conflict between transposable elements (TEs) and their host genome remain elusive. This conflict will be intense in stress-adapted plants as stress can often reactivate TEs. Mangroves reduce TE load convergently in their adaptation to intertidal environments and thus provide a unique opportunity to address the host-TE conflict and its interaction with stress adaptation. Using the mangrove Rhizophora apiculata as a model, we investigated methylation and short interfering RNA (siRNA) targeting patterns in relation to the abundance and age of long terminal repeat (LTR) retrotransposons. We also examined the distance of LTR retrotransposons to genes, the impact on neighboring gene expression and population frequencies. We found differential accumulation amongst classes of LTR retrotransposons despite high overall methylation levels. This can be attributed to 24-nucleotide siRNA-mediated CHH methylation preferentially targeting Gypsy elements, particularly in their LTR regions. Old Gypsy elements possess unusually abundant siRNAs which show cross-mapping to young copies. Gypsy elements appear to be closer to genes and under stronger purifying selection than other classes. Our results suggest a continuous host-TE battle masked by the TE load reduction in R. apiculata. This conflict may enable mangroves, such as R. apiculata, to maintain genetic diversity and thus evolutionary potential during stress adaptation. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.

September 22, 2019

Combining probabilistic alignments with read pair information improves accuracy of split-alignments.

Split-alignments provide base-pair-resolution evidence of genomic rearrangements. In practice, they are found by first computing high-scoring local alignments, parts of which are then combined into a split-alignment. This approach is challenging when aligning a short read to a large and repetitive reference, as it tends to produce many spurious local alignments leading to ambiguities in identifying the correct split-alignment. This problem is further exacerbated by the fact that rearrangements tend to occur in repeat-rich regions. We propose a split-alignment technique that combats the issue of ambiguous alignments by combining information from probabilistic alignment with positional information from paired-end reads. We demonstrate that our method finds accurate split-alignments, and that this translates into improved performance of variant-calling tools that rely on split-alignments. An open-source implementation is freely available at: https://bitbucket.org/splitpairedend/last-split-pe. Supplementary data are available at Bioinformatics online.

September 22, 2019

Prevalence, antimicrobial resistance and phylogenetic characterization of Yersinia enterocolitica in retail poultry meat and swine feces in parts of China

Yersinia enterocolitica is an enteropathogen transmitted by contaminated food. In this study, a total of 500 retail poultry meat samples from 4 provinces and 145 swine feces samples from 12 provinces in China were tested for Y. enterocolitica, and 26 isolates were obtained for bio-serotyping, antimicrobial susceptibility testing against a panel of antimicrobial compounds, and genetic characterization based on whole-genome sequencing. A higher prevalence of Y. enterocolitica contamination was observed in retail poultry meat (4.8%) than in swine feces (2.76%). No differences in bio-serotypes, multilocus sequence typing (MLST) or virulence gene distribution were found between isolates of swine and poultry origin. All isolates were resistant to ampicillin, amoxicillin/clavulanic acid, and cefazolin and were multi-drug resistant (MDR). The most predominant drug-resistance profile was AMP-CFZ-AMC-FOX (42.31%). A pathogenic isolate with bio-serotype 3/O:3 and ST135 was cultured from retail fresh chicken meat for the first time in China. Based on the whole-genome single nucleotide polymorphism (SNP) tree analysis, pathogenic isolates clustered closely, while nonpathogenic isolates exhibited high genetic heterogeneity. This indicates that pathogenic isolates are conserved at the genetic level. The whole-genome SNP tree also revealed that Y. enterocolitica of swine, chicken and duck origin may share a common ancestor. The findings highlight the emergence of drug-resistant pathogenic Y. enterocolitica in retail poultry meat in China.

September 22, 2019

The unique evolution of the pig LRC, a single KIR but expansion of LILR and a novel Ig receptor family.

The leukocyte receptor complex (LRC) encodes numerous immunoglobulin (Ig)-like receptors involved in innate immunity. These include the killer-cell Ig-like receptors (KIR) and the leukocyte Ig-like receptors (LILR) which can be polymorphic and vary greatly in number between species. Using the recent long-read genome assembly, Sscrofa11.1, we have characterized the porcine LRC on chromosome 6. We identified a ~197-kb region containing numerous LILR genes that were missing in previous assemblies. Out of 17 such LILR genes and fragments, six encode functional proteins, of which three are inhibitory and three are activating, while the majority of pseudogenes had the potential to encode activating receptors. Elsewhere in the LRC, between FCAR and GP6, we identified a novel gene that encodes two Ig-like domains and a long inhibitory intracellular tail. Comparison with two other porcine assemblies revealed a second, nearly identical, non-functional gene encoding a short intracellular tail with ambiguous function. These novel genes were found in a diverse range of mammalian species, including a pseudogene in humans, and typically consist of a single long-tailed receptor and a variable number of short-tailed receptors. Using porcine transcriptome data, both the novel inhibitory gene and the LILR were highly expressed in peripheral blood, while the single KIR gene, KIR2DL1, was either very poorly expressed or not at all. These observations are a prerequisite for improved understanding of immune cell functions in the pig and other species.

September 22, 2019

Genomic analysis of multi-resistant Staphylococcus capitis associated with neonatal sepsis.

Coagulase-negative staphylococci (CoNS), such as Staphylococcus capitis, are major causes of bloodstream infections in neonatal intensive care units (NICUs). Recently, a distinct clone of S. capitis (designated S. capitis NRCS-A) has emerged as an important pathogen in NICUs internationally. Here, 122 S. capitis isolates from New Zealand (NZ) underwent whole-genome sequencing (WGS), and these data were supplemented with publicly available S. capitis sequence reads. Phylogenetic and comparative genomic analyses were performed, as were phenotypic assessments of antimicrobial resistance, biofilm formation, and plasmid segregational stability on representative isolates. A distinct lineage of S. capitis was identified in NZ associated with neonates and the NICU environment. Isolates from this lineage produced increased levels of biofilm, displayed higher levels of tolerance to chlorhexidine, and were multidrug resistant. Although similar to globally circulating NICU-associated S. capitis strains at a core-genome level, NZ NICU S. capitis isolates carried a novel stably maintained multidrug-resistant plasmid that was not present in non-NICU isolates. Neonatal blood culture isolates were indistinguishable from environmental S. capitis isolates found on fomites, such as stethoscopes and neonatal incubators, but were generally distinct from those isolates carried by NICU staff. This work implicates the NICU environment as a potential reservoir for neonatal sepsis caused by S. capitis and highlights the capacity of genomics-based tracking and surveillance to inform future hospital infection control practices aimed at containing the spread of this important neonatal pathogen. Copyright © 2018 Carter et al.

September 22, 2019

Computational tools to unmask transposable elements.

A substantial proportion of the genome of many species is derived from transposable elements (TEs). Moreover, through various self-copying mechanisms, TEs continue to proliferate in the genomes of most species. TEs have contributed numerous regulatory, transcript and protein innovations and have also been linked to disease. However, notwithstanding their demonstrated impact, many genomic studies still exclude them because their repetitive nature results in various analytical complexities. Fortunately, a growing array of methods and software tools are being developed to cater for them. This Review presents a summary of computational resources for TEs and highlights some of the challenges and remaining gaps to perform comprehensive genomic analyses that do not simply ‘mask’ repeats.

September 22, 2019

Genomic characterization reveals significant divergence within Chlorella sorokiniana (Chlorellales, Trebouxiophyceae)

Selection of highly productive algal strains is crucial for establishing economically viable biomass and bioproduct cultivation systems. Characterization of algal genomes, including understanding strain-specific differences in genome content and architecture, is a critical step in this process. Using genomic analyses, we demonstrate significant differences between three strains of Chlorella sorokiniana (strain 1228, UTEX 1230, and DOE1412). We found that unique, strain-specific genes comprise a substantial proportion of each genome, and genomic regions with >80% local nucleotide identity constitute <15% of each genome among the strains, indicating substantial strain-specific evolution. Furthermore, cataloging of meiosis and other sex-related genes in C. sorokiniana strains suggests strategic breeding could be utilized to improve biomass and bioproduct yields if a sexual cycle can be characterized. Finally, preliminary investigation of epigenetic machinery suggests the presence of potentially unique transcriptional regulation in each strain. Our data demonstrate that these three C. sorokiniana strains represent significantly different genomic content. Based on these findings, we propose individualized assessment of each strain for potential performance in cultivation systems.

September 22, 2019

TranSurVeyor: an improved database-free algorithm for finding non-reference transpositions in high-throughput sequencing data.

Transpositions transfer DNA segments between different loci within a genome; in particular, when a transposition is found in a sample but not in a reference genome, it is called a non-reference transposition. They are important structural variations that have clinical impact. Transpositions can be called by analyzing second generation high-throughput sequencing datasets. Current methods follow either a database-based or a database-free approach. Database-based methods require a database of transposable elements. Some of them have good specificity; however this approach cannot detect novel transpositions, and it requires a good database of transposable elements, which is not yet available for many species. Database-free methods perform de novo calling of transpositions, but their accuracy is low. We observe that this is due to the misalignment of the reads; since reads are short and the human genome has many repeats, false alignments create false positive predictions while missing alignments reduce the true positive rate. This paper proposes new techniques to improve database-free non-reference transposition calling: first, we propose a realignment strategy called one-end remapping that corrects the alignments of reads in interspersed repeats; second, we propose a SNV-aware filter that removes some incorrectly aligned reads. By combining these two techniques and other techniques like clustering and positive-to-negative ratio filter, our proposed transposition caller TranSurVeyor shows at least 3.1-fold improvement in terms of F1-score over existing database-free methods. More importantly, even though TranSurVeyor does not use databases of prior information, its performance is at least as good as existing database-based methods such as MELT, Mobster and Retroseq. We also illustrate that TranSurVeyor can discover transpositions that are not known in the current database.
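The abstract reports its headline result as an F1-score, which combines the two failure modes it describes: false positive calls from false alignments and missed calls from missing alignments. The short Python sketch below shows how that metric is computed from call counts. The function name `f1_score` and the counts in the example are hypothetical and are not taken from the paper.

```python
def f1_score(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall for a set of variant calls."""
    if true_pos == 0:
        return 0.0  # no correct calls: define F1 as zero
    precision = true_pos / (true_pos + false_pos)  # fraction of calls that are real
    recall = true_pos / (true_pos + false_neg)     # fraction of real events that were called
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up counts for two hypothetical callers on the same truth set.
    baseline = f1_score(true_pos=20, false_pos=60, false_neg=80)
    improved = f1_score(true_pos=70, false_pos=20, false_neg=30)
    print(f"baseline F1 = {baseline:.3f}, improved F1 = {improved:.3f}")
    print(f"fold improvement = {improved / baseline:.2f}")
```

Because F1 is a harmonic mean, a caller cannot score well by trading many false positives for a few extra true positives, which is why it suits the comparison between database-free and database-based methods described above.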
