September 22, 2019

Targeted genotyping of variable number tandem repeats with adVNTR.

Whole-genome sequencing is increasingly used to identify Mendelian variants in clinical pipelines. These pipelines focus on single-nucleotide variants (SNVs) and also structural variants, while ignoring more complex repeat sequence variants. Here, we consider the problem of genotyping Variable Number Tandem Repeats (VNTRs), composed of inexact tandem duplications of short (6-100 bp) repeating units. VNTRs span 3% of the human genome, are frequently present in coding regions, and have been implicated in multiple Mendelian disorders. Although existing tools recognize VNTR-carrying sequence, genotyping VNTRs (determining repeat unit count and sequence variation) from whole-genome sequencing reads remains challenging. We describe a method, adVNTR, that uses hidden Markov models to model each VNTR, count repeat units, and detect sequence variation. adVNTR models can be developed for short-read (Illumina) and single-molecule (Pacific Biosciences [PacBio]) whole-genome and whole-exome sequencing, and show good results on multiple simulated and real data sets. © 2018 Bakhtiari et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019

A continuous genome assembly of the corkwing wrasse (Symphodus melops).

The wrasses (Labridae) are one of the most successful and species-rich families of the Perciformes order of teleost fish. Its members display great morphological diversity, and occupy distinct trophic levels in coastal waters and coral reefs. The cleaning behaviour displayed by some wrasses, such as corkwing wrasse (Symphodus melops), is of particular interest for the salmon aquaculture industry to combat and control sea lice infestation as an alternative to chemicals and pharmaceuticals. There are still few genome assemblies available within this fish family for comparative and functional studies, despite the rapid increase in genome resources generated during the past years. Here, we present a highly continuous genome assembly of the corkwing wrasse using PacBio SMRT sequencing (x28.8) followed by error correction with paired-end Illumina data (x132.9). The present genome assembly consists of 5040 contigs (N50 = 461,652 bp) and a total size of 614 Mbp, of which 8.5% of the genome sequence encodes known repeated elements. The genome assembly covers 94.21% of highly conserved genes across ray-finned fish species. We find evidence for increased copy numbers specific for corkwing wrasse possibly highlighting diversification and adaptive processes in gene families including N-linked glycosylation (ST8SIA6) and stress response kinases (HIPK1). By comparative analyses, we discover that de novo repeats, often not properly investigated during genome annotation, encode hundreds of immune-related genes. This new genomic resource, together with the ballan wrasse (Labrus bergylta), will allow for in-depth comparative genomics as well as population genetic analyses for the understudied wrasses. Copyright © 2018 Elsevier Inc. All rights reserved.


September 22, 2019

Constant conflict between Gypsy LTR retrotransposons and CHH methylation within a stress-adapted mangrove genome.

The evolutionary dynamics of the conflict between transposable elements (TEs) and their host genome remain elusive. This conflict will be intense in stress-adapted plants as stress can often reactivate TEs. Mangroves reduce TE load convergently in their adaptation to intertidal environments and thus provide a unique opportunity to address the host-TE conflict and its interaction with stress adaptation. Using the mangrove Rhizophora apiculata as a model, we investigated methylation and short interfering RNA (siRNA) targeting patterns in relation to the abundance and age of long terminal repeat (LTR) retrotransposons. We also examined the distance of LTR retrotransposons to genes, the impact on neighboring gene expression and population frequencies. We found differential accumulation amongst classes of LTR retrotransposons despite high overall methylation levels. This can be attributed to 24-nucleotide siRNA-mediated CHH methylation preferentially targeting Gypsy elements, particularly in their LTR regions. Old Gypsy elements possess unusually abundant siRNAs which show cross-mapping to young copies. Gypsy elements appear to be closer to genes and under stronger purifying selection than other classes. Our results suggest a continuous host-TE battle masked by the TE load reduction in R. apiculata. This conflict may enable mangroves, such as R. apiculata, to maintain genetic diversity and thus evolutionary potential during stress adaptation. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.


September 22, 2019

Thermosipho spp. immune system differences affect variation in genome size and geographical distributions.

Thermosipho species inhabit thermal environments such as marine hydrothermal vents, petroleum reservoirs, and terrestrial hot springs. A 16S rRNA phylogeny of available Thermosipho spp. sequences suggested habitat specialists adapted to living in hydrothermal vents only, and habitat generalists inhabiting oil reservoirs, hydrothermal vents, and hot springs. Comparative genomics of 15 Thermosipho genomes separated them into three distinct species with different habitat distributions: the widely distributed T. africanus and the more specialized T. melanesiensis and T. affectus. Moreover, the species can be differentiated on the basis of genome size (GS), genome content, and immune system composition. For instance, the T. africanus genomes are the largest and contain the most carbohydrate metabolism genes, which could explain why these isolates were obtained from ecologically more divergent habitats. Nonetheless, all the Thermosipho genomes, like other Thermotogae genomes, show evidence of genome streamlining. GS differences between the species could further be correlated to differences in defense capacities against foreign DNA, which influence recombination via HGT. The smallest genomes are found in T. affectus; they contain both CRISPR-cas Type I and III systems, but no RM system genes. We suggest that this has caused these genomes to be almost devoid of mobile elements, in contrast to the genomes of the other two species, which contain a higher abundance of mobile elements combined with different immune system configurations. Taken together, the comparative genomic analyses of Thermosipho spp. revealed genetic variation allowing habitat differentiation within the genus as well as differentiation with respect to invading mobile DNA.


September 22, 2019

The unique evolution of the pig LRC, a single KIR but expansion of LILR and a novel Ig receptor family.

The leukocyte receptor complex (LRC) encodes numerous immunoglobulin (Ig)-like receptors involved in innate immunity. These include the killer-cell Ig-like receptors (KIR) and the leukocyte Ig-like receptors (LILR) which can be polymorphic and vary greatly in number between species. Using the recent long-read genome assembly, Sscrofa11.1, we have characterized the porcine LRC on chromosome 6. We identified a ~197-kb region containing numerous LILR genes that were missing in previous assemblies. Out of 17 such LILR genes and fragments, six encode functional proteins, of which three are inhibitory and three are activating, while the majority of pseudogenes had the potential to encode activating receptors. Elsewhere in the LRC, between FCAR and GP6, we identified a novel gene that encodes two Ig-like domains and a long inhibitory intracellular tail. Comparison with two other porcine assemblies revealed a second, nearly identical, non-functional gene encoding a short intracellular tail with ambiguous function. These novel genes were found in a diverse range of mammalian species, including a pseudogene in humans, and typically consist of a single long-tailed receptor and a variable number of short-tailed receptors. Using porcine transcriptome data, both the novel inhibitory gene and the LILR were highly expressed in peripheral blood, while the single KIR gene, KIR2DL1, was either very poorly expressed or not at all. These observations are a prerequisite for improved understanding of immune cell functions in the pig and other species.


September 22, 2019

Genomic analysis of multi-resistant Staphylococcus capitis associated with neonatal sepsis.

Coagulase-negative staphylococci (CoNS), such as Staphylococcus capitis, are major causes of bloodstream infections in neonatal intensive care units (NICUs). Recently, a distinct clone of S. capitis (designated S. capitis NRCS-A) has emerged as an important pathogen in NICUs internationally. Here, 122 S. capitis isolates from New Zealand (NZ) underwent whole-genome sequencing (WGS), and these data were supplemented with publicly available S. capitis sequence reads. Phylogenetic and comparative genomic analyses were performed, as were phenotypic assessments of antimicrobial resistance, biofilm formation, and plasmid segregational stability on representative isolates. A distinct lineage of S. capitis was identified in NZ associated with neonates and the NICU environment. Isolates from this lineage produced increased levels of biofilm, displayed higher levels of tolerance to chlorhexidine, and were multidrug resistant. Although similar to globally circulating NICU-associated S. capitis strains at a core-genome level, NZ NICU S. capitis isolates carried a novel stably maintained multidrug-resistant plasmid that was not present in non-NICU isolates. Neonatal blood culture isolates were indistinguishable from environmental S. capitis isolates found on fomites, such as stethoscopes and neonatal incubators, but were generally distinct from those isolates carried by NICU staff. This work implicates the NICU environment as a potential reservoir for neonatal sepsis caused by S. capitis and highlights the capacity of genomics-based tracking and surveillance to inform future hospital infection control practices aimed at containing the spread of this important neonatal pathogen. Copyright © 2018 Carter et al.


September 22, 2019

Computational tools to unmask transposable elements.

A substantial proportion of the genome of many species is derived from transposable elements (TEs). Moreover, through various self-copying mechanisms, TEs continue to proliferate in the genomes of most species. TEs have contributed numerous regulatory, transcript and protein innovations and have also been linked to disease. However, notwithstanding their demonstrated impact, many genomic studies still exclude them because their repetitive nature results in various analytical complexities. Fortunately, a growing array of methods and software tools are being developed to cater for them. This Review presents a summary of computational resources for TEs and highlights some of the challenges and remaining gaps to perform comprehensive genomic analyses that do not simply ‘mask’ repeats.


September 22, 2019

Comparative genomic and methylome analysis of non-virulent D74 and virulent Nagasaki Haemophilus parasuis isolates.

Haemophilus parasuis is a respiratory pathogen of swine and the etiological agent of Glässer’s disease. H. parasuis isolates can exhibit different virulence capabilities ranging from lethal systemic disease to subclinical carriage. To identify genomic differences between phenotypically distinct strains, we obtained the closed whole-genome sequence annotation and genome-wide methylation patterns for the highly virulent Nagasaki strain and for the non-virulent D74 strain. Evaluation of the virulence-associated genes contained within the genomes of D74 and Nagasaki led to the discovery of a large number of toxin-antitoxin (TA) systems within both genomes. Five predicted hemolysins were identified as unique to Nagasaki and seven putative contact-dependent growth inhibition toxin proteins were identified only in strain D74. Assessment of all potential vtaA genes revealed thirteen present in the Nagasaki genome and three in the D74 genome. Subsequent evaluation of the predicted protein structure revealed that none of the D74 VtaA proteins contain a collagen triple helix repeat domain. Additionally, the predicted protein sequence for two D74 VtaA proteins is substantially longer than any predicted Nagasaki VtaA proteins. Fifteen methylation sequence motifs were identified in D74 and fourteen methylation sequence motifs were identified in Nagasaki using SMRT sequencing analysis. Only one of the methylation sequence motifs was observed in both strains indicative of the diversity between D74 and Nagasaki. Subsequent analysis also revealed diversity in the restriction-modification systems harbored by D74 and Nagasaki. The collective information reported in this study will aid in the development of vaccines and intervention strategies to decrease the prevalence and disease burden caused by H. parasuis.


September 22, 2019

Whole-genome sequencing of Chinese yellow catfish provides a valuable genetic resource for high-throughput identification of toxin genes.

Naturally derived toxins from animals are good raw materials for drug development. As a representative venomous teleost, Chinese yellow catfish (Pelteobagrus fulvidraco) can provide valuable resources for studies on toxin genes. Its venom glands are located in the pectoral and dorsal fins. Despite these interesting biological traits and its great economic value, Chinese yellow catfish still lacks a sequenced genome. Here, we report a high-quality genome assembly of Chinese yellow catfish using a combination of next-generation Illumina and third-generation PacBio sequencing platforms. The final assembly reached 714 Mb, with a contig N50 of 970 kb and a scaffold N50 of 3.65 Mb. We also annotated 21,562 protein-coding genes, of which 97.59% were assigned at least one functional annotation. Based on the genome sequence, we analyzed toxin genes in Chinese yellow catfish. Finally, we identified 207 toxin genes and classified them into three major groups. Interestingly, we also expanded a previously reported sex-related region (to ~6 Mb) in the achieved genome assembly, and localized two important toxin genes within this region. In summary, we assembled a high-quality genome of Chinese yellow catfish and performed high-throughput identification of toxin genes from a genomic view. Therefore, the limited number of toxin sequences in public databases will be remarkably improved once we integrate multi-omics data from more and more sequenced species.


September 22, 2019

Genotype to phenotype: Diet-by-mitochondrial DNA haplotype interactions drive metabolic flexibility and organismal fitness.

Diet may be modified seasonally or by biogeographic, demographic or cultural shifts. It can differentially influence mitochondrial bioenergetics, retrograde signalling to the nuclear genome, and anterograde signalling to mitochondria. All these interactions have the potential to alter the frequencies of mtDNA haplotypes (mitotypes) in nature and may impact human health. In a model laboratory system, we fed four diets varying in Protein: Carbohydrate (P:C) ratio (1:2, 1:4, 1:8 and 1:16 P:C) to four homoplasmic Drosophila melanogaster mitotypes (nuclear genome standardised) and assayed their frequency in population cages. When fed a high protein 1:2 P:C diet, the frequency of flies harbouring Alstonville mtDNA increased. In contrast, when fed the high carbohydrate 1:16 P:C food the incidence of flies harbouring Dahomey mtDNA increased. This result, driven by differences in larval development, was generalisable to the replacement of the laboratory diet with fruits having high and low P:C ratios, perturbation of the nuclear genome and changes to the microbiome. Structural modelling and cellular assays suggested a V161L mutation in the ND4 subunit of complex I of Dahomey mtDNA was mildly deleterious, reduced mitochondrial functions, increased oxidative stress and resulted in an increase in larval development time on the 1:2 P:C diet. The 1:16 P:C diet triggered a cascade of changes in both mitotypes. In Dahomey larvae, increased feeding fuelled increased β-oxidation and the partial bypass of the complex I mutation. Conversely, Alstonville larvae upregulated genes involved with oxidative phosphorylation, increased glycogen metabolism and they were more physically active. We hypothesise that the increased physical activity diverted energy from growth and cell division and thereby slowed development. These data further question the use of mtDNA as an assumed neutral marker in evolutionary and population genetic studies. Moreover, if humans respond similarly, we posit that individuals with specific mtDNA variations may differentially metabolise carbohydrates, which has implications for a variety of diseases including cardiovascular disease, obesity, and perhaps Parkinson’s Disease.


September 22, 2019

An improved genome assembly for Larimichthys crocea reveals hepcidin gene expansion with diversified regulation and function.

Larimichthys crocea (large yellow croaker) is a type of perciform fish well known for its peculiar physiological properties and economic value. Here, we constructed an improved version of the L. crocea genome assembly, which contained 26,100 protein-coding genes. Twenty-four pseudo-chromosomes of L. crocea were also reconstructed, comprising 90% of the genome assembly. This improved assembly revealed several expansions in gene families associated with olfactory detection, detoxification, and innate immunity. Specifically, six hepcidin genes (LcHamps) were identified in L. crocea, possibly resulting from lineage-specific gene duplication. All LcHamps possessed similar genomic structures and functional domains, but varied substantially with respect to expression pattern, transcriptional regulation, and biological function. LcHamp1 was associated specifically with iron metabolism, while LcHamp2s were functionally diverse, being involved in antibacterial activity, antiviral activity, and regulation of intracellular iron metabolism. This functional diversity among gene copies may have allowed L. crocea to adapt to diverse environmental conditions.


September 22, 2019

Microevolution of Neisseria lactamica during nasopharyngeal colonisation induced by controlled human infection.

Neisseria lactamica is a harmless coloniser of the infant respiratory tract, and has a mutually-excluding relationship with the pathogen Neisseria meningitidis. Here we report controlled human infection with genomically-defined N. lactamica and subsequent bacterial microevolution during 26 weeks of colonisation. We find that most mutations that occur during nasopharyngeal carriage are transient indels within repetitive tracts of putative phase-variable loci associated with host-microbe interactions (pgl and lgt) and iron acquisition (fetA promoter and hpuA). Recurrent polymorphisms occurred in genes associated with energy metabolism (nuoN, rssA) and the CRISPR-associated cas1. A gene encoding a large hypothetical protein was mutated in 27% of the subjects. In volunteers who were naturally co-colonised with meningococci, recombination altered allelic identity in N. lactamica to resemble meningococcal alleles, including loci associated with metabolism, outer membrane proteins and immune response activators. Our results suggest that phase variable genes are often mutated during carriage-associated microevolution.


September 22, 2019

Acquired interbacterial defense systems protect against interspecies antagonism in the human gut microbiome

The genomes of bacteria derived from the gut microbiota are replete with pathways that mediate contact-dependent interbacterial antagonism. However, the role of direct interactions between co-resident microbes in driving microbiome composition is not well understood. Here we report the widespread occurrence of acquired interbacterial defense (AID) gene clusters in the human gut microbiome. These clusters are found on predicted mobile elements and encode arrays of immunity genes that confer protection against interbacterial toxin-mediated antagonism in vitro and in gnotobiotic mice. We find that Bacteroides ovatus strains containing AID systems that inactivate B. fragilis toxins delivered between cells by the type VI secretion system are enriched in samples lacking detectable B. fragilis. Moreover, these strains display significantly higher abundance in gut metagenomes than strains without AID systems. Finally, we identify a recombinase-associated AID subtype present broadly in Bacteroidales genomes with features suggestive of active gene acquisition. Our data suggest that neutralization of contact-dependent interbacterial antagonism via AID systems plays an important role in shaping human gut microbiome ecology.


September 22, 2019

Genomic Tandem Quadruplication is Associated with Ketoconazole Resistance in Malassezia pachydermatis.

Malassezia pachydermatis is a commensal yeast found on the skin of dogs. However, M. pachydermatis is also considered an opportunistic pathogen and is associated with various canine skin diseases including otitis externa and atopic dermatitis, which usually require treatment using an azole antifungal drug, such as ketoconazole. In this study, we isolated a ketoconazole-resistant strain of M. pachydermatis, designated “KCTC 27587,” from the external ear canal of a dog with otitis externa and analyzed its resistance mechanism. To understand the mechanism underlying ketoconazole resistance of the clinical isolate M. pachydermatis KCTC 27587, the whole genome of the yeast was sequenced using the PacBio platform and was compared with M. pachydermatis type strain CBS 1879. We found that a ~84-kb region in chromosome 4 of M. pachydermatis KCTC 27587 was tandemly quadruplicated. The quadruplicated region contains 52 protein coding genes, including the homologs of ERG4 and ERG11, whose overexpression is known to be associated with azole resistance. Our data suggest that the quadruplication of the ~84-kb region may be the cause of the ketoconazole resistance in M. pachydermatis KCTC 27587.


September 22, 2019

Construction of stable fluorescent laboratory control strains for several food safety relevant Enterobacteriaceae.

Using naturally-occurring bacterial strains as positive controls in testing protocols is typically feared due to the risk of cross-contaminating samples. We have developed a collection of strains which express Green Fluorescent Protein (GFP) at a high level, permitting rapid screening of the following species on selective or non-selective plates: Escherichia coli O157:H7, Shigella sonnei, S. flexneri, Salmonella enterica subsp. enterica serovar Gaminara, S. Mbandaka, S. Tennessee, S. Minnesota, S. Senftenberg and S. Typhimurium. These new strains fluoresce when irradiated with UV light and maintain this phenotype in the absence of antibiotic selection. Recombinants were phenotypically equivalent to the parent strain, except for S. Tennessee Sal66 that appeared Lac- on Xylose Lysine Deoxycholate (XLD) agar plates and Lac+ on MacConkey and Hektoen Enteric agar plates. Analysis of closed whole genome sequences revealed that Sal66 had lost one lactose operon; slower rates of lactose metabolism may affect lactose fermentation on XLD agar. These fluorescent enteric control strains were challenging to develop and should provide an easy and effective means of identifying cross-contamination. Published by Elsevier Ltd.

