September 22, 2019

Purge Haplotigs: allelic contig reassignment for third-gen diploid genome assemblies.

Recent developments in third-gen long read sequencing and diploid-aware assemblers have resulted in the rapid release of numerous reference-quality assemblies for diploid genomes. However, assembly of highly heterozygous genomes is still problematic when regional heterogeneity is so high that haplotype homology is not recognised during assembly. This results in regional duplication rather than consolidation into allelic variants and can cause issues with downstream analysis, for example variant discovery, or haplotype reconstruction using the diploid assembly with unpaired allelic contigs. A new pipeline, Purge Haplotigs, was developed specifically for third-gen sequencing-based assemblies to automate the reassignment of allelic contigs, and to assist in the manual curation of genome assemblies. The pipeline uses a draft haplotype-fused assembly or a diploid assembly, read alignments, and repeat annotations to identify allelic variants in the primary assembly. The pipeline was tested on a simulated dataset and on four recent diploid (phased) de novo assemblies from third-generation long-read sequencing, and compared with a similar tool. After processing with Purge Haplotigs, haploid assemblies were less duplicated with minimal impact on genome completeness, and diploid assemblies had more pairings of allelic contigs. Purge Haplotigs improves the haploid and diploid representations of third-gen sequencing based genome assemblies by identifying and reassigning allelic contigs. The implementation is fast and scales well with large genomes, and it is less likely to over-purge repetitive or paralogous elements compared to alignment-only based methods. The software is available at https://bitbucket.org/mroachawri/purge_haplotigs under a permissive MIT licence.


September 22, 2019

Unexpected patterns of segregation distortion at a selfish supergene in the fire ant Solenopsis invicta.

The Sb supergene in the fire ant Solenopsis invicta determines the form of colony social organization, with colonies whose inhabitants bear the element containing multiple reproductive queens and colonies lacking it containing only a single queen. Several features of this supergene – including suppressed recombination, presence of deleterious mutations, association with a large centromere, and “green-beard” behavior – suggest that it may be a selfish genetic element that engages in transmission ratio distortion (TRD), defined as significant departures in progeny allele frequencies from Mendelian inheritance ratios. We tested this possibility by surveying segregation ratios in embryo progenies of 101 queens of the “polygyne” social form (3512 embryos) using three supergene-linked markers and twelve markers outside the supergene. Significant departures from Mendelian ratios were observed at the supergene loci in 3-5 times more progenies than expected in the absence of TRD and than found, on average, among non-supergene loci. Also, supergene loci displayed the greatest mean deviations from Mendelian ratios among all study loci, although these typically were modest. A surprising feature of the observed inter-progeny variation in TRD was that significant deviations involved not only excesses of supergene alleles but also similarly frequent excesses of the alternate alleles on the homologous chromosome. As expected given the common occurrence of such “drive reversal” in this system, alleles associated with the supergene gain no consistent transmission advantage over their alternate alleles at the population level. Finally, we observed low levels of recombination and incomplete gametic disequilibrium across the supergene, including between adjacent markers within a single inversion. Our data confirm the prediction that the Sb supergene is a selfish genetic element capable of biasing its own transmission during reproduction, yet counterselection for suppressor loci evidently has produced an evolutionary stalemate in TRD between the variant homologous haplotypes on the “social chromosome”. Evidence implicates prezygotic segregation distortion as responsible for the TRD we document, with “true” meiotic drive the most likely mechanism. Low levels of recombination and incomplete gametic disequilibrium across the supergene suggest that selection does not preserve a single uniform supergene haplotype responsible for inducing polygyny.


September 22, 2019

An improved genome assembly for Larimichthys crocea reveals hepcidin gene expansion with diversified regulation and function.

Larimichthys crocea (large yellow croaker) is a type of perciform fish well known for its peculiar physiological properties and economic value. Here, we constructed an improved version of the L. crocea genome assembly, which contained 26,100 protein-coding genes. Twenty-four pseudo-chromosomes of L. crocea were also reconstructed, comprising 90% of the genome assembly. This improved assembly revealed several expansions in gene families associated with olfactory detection, detoxification, and innate immunity. Specifically, six hepcidin genes (LcHamps) were identified in L. crocea, possibly resulting from lineage-specific gene duplication. All LcHamps possessed similar genomic structures and functional domains, but varied substantially with respect to expression pattern, transcriptional regulation, and biological function. LcHamp1 was associated specifically with iron metabolism, while LcHamp2s were functionally diverse, being involved in antibacterial activity, antiviral activity, and regulation of intracellular iron metabolism. This functional diversity among gene copies may have allowed L. crocea to adapt to diverse environmental conditions.


September 22, 2019

Improved reference genome for the domestic horse increases assembly contiguity and composition.

Recent advances in genomic sequencing technology and computational assembly methods have allowed scientists to improve reference genome assemblies in terms of contiguity and composition. EquCab2, a reference genome for the domestic horse, was released in 2007. Although of equal or better quality compared to other first-generation Sanger assemblies, it had many of the shortcomings common to them. In 2014, the equine genomics research community began a project to improve the reference sequence for the horse, building upon the solid foundation of EquCab2 and incorporating new short-read data, long-read data, and proximity ligation data. Here, we present EquCab3. The count of non-N bases in the incorporated chromosomes is improved from 2.33 Gb in EquCab2 to 2.41 Gb in EquCab3. Contiguity has also been improved nearly 40-fold with a contig N50 of 4.5 Mb and scaffold contiguity enhanced to where all but one of the 32 chromosomes is comprised of a single scaffold.


September 22, 2019

Out in the cold: Identification of genomic regions associated with cold tolerance in the biocontrol fungus Clonostachys rosea through genome-wide association mapping.

Biocontrol agents are increasingly important for combating plant diseases sustainably and in the long term. As large scale genomic sequencing becomes economically viable, the impact of single nucleotide polymorphisms (SNPs) on biocontrol-associated phenotypes can be easily studied across entire genomes of fungal populations. Here, we improved a previously reported genome assembly of the biocontrol fungus Clonostachys rosea strain IK726 using the PacBio sequencing platform, which resulted in a total genome size of 70.7 Mbp and 21,246 predicted genes. We further performed whole-genome re-sequencing of 52 additional C. rosea strains isolated globally using Illumina sequencing technology, in order to perform genome-wide association studies in conditions relevant for biocontrol activity. One such condition is the ability to grow at lower temperatures commonly encountered in cryic or frigid soils in temperate regions, as these conditions will be prevalent when protecting growing crops in temperate climates. Growth rates at 10°C on potato dextrose agar of the 53 sequenced strains of C. rosea were measured and ranged between 0.066 and 0.413 mm/day. In a genome-wide association study, a total of 1,478 SNP markers were significantly associated with the trait and located in 227 scaffolds, within or close to (< 1000 bp distance) 265 different genes. The predicted gene products included several chaperone proteins, membrane transporters, lipases, and proteins involved in chitin metabolism with possible roles in cold tolerance. The data reported in this study provide a foundation for future investigations into the genetic basis for cold tolerance in fungi, with important implications for biocontrol.


September 22, 2019

Microevolution of Neisseria lactamica during nasopharyngeal colonisation induced by controlled human infection.

Neisseria lactamica is a harmless coloniser of the infant respiratory tract, and has a mutually-excluding relationship with the pathogen Neisseria meningitidis. Here we report controlled human infection with genomically-defined N. lactamica and subsequent bacterial microevolution during 26 weeks of colonisation. We find that most mutations that occur during nasopharyngeal carriage are transient indels within repetitive tracts of putative phase-variable loci associated with host-microbe interactions (pgl and lgt) and iron acquisition (fetA promoter and hpuA). Recurrent polymorphisms occurred in genes associated with energy metabolism (nuoN, rssA) and the CRISPR-associated cas1. A gene encoding a large hypothetical protein was often mutated in 27% of the subjects. In volunteers who were naturally co-colonised with meningococci, recombination altered allelic identity in N. lactamica to resemble meningococcal alleles, including loci associated with metabolism, outer membrane proteins and immune response activators. Our results suggest that phase variable genes are often mutated during carriage-associated microevolution.


September 22, 2019

Genomic surveillance of Enterococcus faecium reveals limited sharing of strains and resistance genes between livestock and humans in the United Kingdom.

Vancomycin-resistant Enterococcus faecium (VREfm) is a major cause of nosocomial infection and is categorized as high priority by the World Health Organization global priority list of antibiotic-resistant bacteria. In the past, livestock have been proposed as a putative reservoir for drug-resistant E. faecium strains that infect humans, and isolates of the same lineage have been found in both reservoirs. We undertook cross-sectional surveys to isolate E. faecium (including VREfm) from livestock farms, retail meat, and wastewater treatment plants in the United Kingdom. More than 600 isolates from these sources were sequenced, and their relatedness and antibiotic resistance genes were compared with genomes of almost 800 E. faecium isolates from patients with bloodstream infection in the United Kingdom and Ireland. E. faecium was isolated from 28/29 farms; none of these isolates were VREfm, suggesting a decrease in VREfm prevalence since the last UK livestock survey in 2003. However, VREfm was isolated from 1% to 2% of retail meat products and was ubiquitous in wastewater treatment plants. Phylogenetic comparison demonstrated that the majority of human and livestock-related isolates were genetically distinct, although pig isolates from three farms were more genetically related to human isolates from 2001 to 2004 (minimum of 50 single-nucleotide polymorphisms [SNPs]). Analysis of accessory (variable) genes added further evidence for distinct niche adaptation. An analysis of acquired antibiotic resistance genes and their variants revealed limited sharing between humans and livestock. Our findings indicate that the majority of E. faecium strains infecting patients are largely distinct from those from livestock in this setting, with limited sharing of strains and resistance genes.

IMPORTANCE: The rise in rates of human infection caused by vancomycin-resistant Enterococcus faecium (VREfm) strains between 1988 and the 2000s in Europe was suggested to be associated with acquisition from livestock. As a result, the European Union banned the use of the glycopeptide drug avoparcin as a growth promoter in livestock feed. While some studies reported a decrease in VREfm in livestock, others reported no reduction. Here, we report the first livestock VREfm prevalence survey in the UK since 2003 and the first large-scale study using whole-genome sequencing to investigate the relationship between E. faecium strains in livestock and humans. We found a low prevalence of VREfm in retail meat and limited evidence for recent sharing of strains between livestock and humans with bloodstream infection. There was evidence for limited sharing of genes encoding antibiotic resistance between these reservoirs, a finding which requires further research. Copyright © 2018 Gouliouris et al.


September 22, 2019

Leishmania genome dynamics during environmental adaptation reveal strain-specific differences in gene copy number variation, karyotype instability, and telomeric amplification.

Protozoan parasites of the genus Leishmania adapt to environmental change through chromosome and gene copy number variations. Little is known about external or intrinsic factors that govern Leishmania genomic adaptation. Here, by conducting longitudinal genome analyses of 10 new Leishmania clinical isolates, we uncovered important differences in gene copy number among genetically highly related strains and revealed gain and loss of gene copies as potential drivers of long-term environmental adaptation in the field. In contrast, chromosome rather than gene amplification was associated with short-term environmental adaptation to in vitro culture. Karyotypic solutions were highly reproducible but unique for a given strain, suggesting that chromosome amplification is under positive selection and dependent on species- and strain-specific intrinsic factors. We revealed a progressive increase in read depth towards the chromosome ends for various Leishmania isolates, which may represent a nonclassical mechanism of telomere maintenance that can preserve integrity of chromosome ends during selection for fast in vitro growth. Together our data draw a complex picture of Leishmania genomic adaptation in the field and in culture, which is driven by a combination of intrinsic genetic factors that generate strain-specific phenotypic variations, which are under environmental selection and allow for fitness gain.

IMPORTANCE: Protozoan parasites of the genus Leishmania cause severe human and veterinary diseases worldwide, termed leishmaniases. A hallmark of Leishmania biology is its capacity to adapt to a variety of unpredictable fluctuations inside its human host, notably pharmacological interventions, thus causing drug resistance. Here we investigated mechanisms of environmental adaptation using a comparative genomics approach by sequencing 10 new clinical isolates of the L. donovani, L. major, and L. tropica complexes that were sampled across eight distinct geographical regions. Our data provide new evidence that parasites adapt to environmental change in the field and in culture through a combination of chromosome and gene amplification that likely causes phenotypic variation and drives parasite fitness gains in response to environmental constraints. This novel form of gene expression regulation through genomic change compensates for the absence of classical transcriptional control in these early-branching eukaryotes and opens new avenues for biomarker discovery. Copyright © 2018 Bussotti et al.


September 22, 2019

N6-methyladenine DNA modification in Xanthomonas oryzae pv. oryzicola genome.

DNA N6-methyladenine (6mA) modifications expand the information capacity of DNA and have long been known to exist in bacterial genomes. Xanthomonas oryzae pv. oryzicola (Xoc) is the causative agent of bacterial leaf streak, an emerging and destructive disease in rice worldwide. However, the genome-wide distribution patterns and potential functions of 6mA in Xoc are largely unknown. In this study, we analyzed the levels and global distribution patterns of 6mA modification in genomic DNA of seven Xoc strains (BLS256, BLS279, CFBP2286, CFBP7331, CFBP7341, L8 and RS105). The 6mA modification was found to be widely distributed across the seven Xoc genomes, accounting for 3.80%, 3.10%, 3.70%, 4.20%, 3.40%, 2.10%, and 3.10% of the total adenines in BLS256, BLS279, CFBP2286, CFBP7331, CFBP7341, L8, and RS105, respectively. Notably, more than 82% of 6mA sites were located within gene bodies in all seven strains. Two specific motifs for 6mA modification, ARGT and AVCG, were prevalent in all seven strains. Comparison of putative DNA methylation motifs from the seven strains reveals that Xoc has a specific DNA methylation system. Furthermore, the 6mA modification of rpfC decreased dramatically during Xoc infection, indicating an important role in the adaptation of Xoc to its environment.


September 22, 2019

The impact of genome evolution on the allotetraploid Nicotiana rustica – an intriguing story of enhanced alkaloid production.

Nicotiana rustica (Aztec tobacco), like common tobacco (Nicotiana tabacum), is an allotetraploid formed through a recent hybridization event; however, it originated from completely different progenitor species. Here, we report the comparative genome analysis of wild type N. rustica (5 Gb; 2n = 4x = 48) with its three putative diploid progenitors (2.3-3 Gb; 2n = 2x = 24), Nicotiana undulata, Nicotiana paniculata and Nicotiana knightiana. In total, 41% of N. rustica genome originated from the paternal donor (N. undulata), while 59% originated from the maternal donor (N. paniculata/N. knightiana). Chloroplast genome and gene analyses indicated that N. knightiana is more closely related to N. rustica than N. paniculata. Gene clustering revealed 14,623 ortholog groups common to other Nicotiana species and 207 unique to N. rustica. Genome sequence analysis indicated that N. knightiana is more closely related to N. rustica than N. paniculata, and that the higher nicotine content of N. rustica leaves is the result of the combination of the progenitor genomes and of a more active transport of nicotine to the shoot. The availability of four new Nicotiana genome sequences provides insights into how speciation impacts plant metabolism, and in particular alkaloid transport and accumulation, and will contribute to better understanding the evolution of Nicotiana species.


September 22, 2019

Functionality of two origins of replication in Vibrio cholerae strains with a single chromosome.

Chromosomal inheritance in bacteria usually entails bidirectional replication of a single chromosome from a single origin into two copies and subsequent partitioning of one copy each into daughter cells upon cell division. However, the human pathogen Vibrio cholerae and other Vibrionaceae harbor two chromosomes, a large Chr1 and a small Chr2. Chr1 and Chr2 have different origins, an oriC-type origin and a P1 plasmid-type origin, respectively, driving the replication of respective chromosomes. Recently, we described naturally occurring exceptions to the two-chromosome rule of Vibrionaceae: i.e., single-chromosome V. cholerae strains, NSCV1 and NSCV2, in which Chr1 and Chr2 are fused and both origins of replication are present. Using NSCV1 and NSCV2, here we tested whether two types of origins of replication can function simultaneously on the same chromosome or whether one or the other origin is silenced. We found that in NSCV1, both origins are active, whereas in NSCV2, ori2 is silenced despite the fact that it is functional in an isolated context. The ori2 activity appears to be primarily determined by the copy number of the triggering site, crtS, which in turn is determined by its location with respect to ori1 and ori2 on the fused chromosome.


September 22, 2019

Conjugative transfer of a novel Staphylococcal plasmid encoding the biocide resistance gene, qacA.

Staphylococcus aureus is the leading cause of skin and soft tissue infections (SSTI). Some S. aureus strains harbor plasmids that carry genes that affect resistance to biocides. Among these genes, qacA encodes the QacA Multidrug Efflux Pump that imparts decreased susceptibility to chlorhexidine, a biocide used ubiquitously in healthcare facilities. Furthermore, chlorhexidine has been considered as a S. aureus decolonization strategy in community settings. We previously conducted a chlorhexidine-based SSTI prevention trial among Ft. Benning Army trainees. Analysis of a clinical isolate (C02) from that trial identified a novel qacA-positive plasmid, pC02. Prior characterization of qacA-containing plasmids is limited and conjugative transfer of those plasmids has not been demonstrated. Given the implications of increased biocide resistance, herein we characterized pC02. In silico analysis identified genes typically associated with conjugative plasmids. Moreover, pC02 was efficiently transferred to numerous S. aureus strains and to Staphylococcus epidermidis. We screened additional qacA-positive S. aureus clinical isolates and pC02 was present in 27% of those strains; other unique qacA-harboring plasmids were also identified. Ten strains were subjected to whole genome sequencing. Sequence analysis combined with plasmid screening studies suggests that qacA-containing strains are transmitted among military personnel at Ft. Benning and that strains carrying qacA are associated with SSTIs within this population. The identification of a novel mechanism of qacA conjugative transfer among Staphylococcal strains suggests a possible future increase in the prevalence of antiseptic-tolerant bacterial strains, and an increase in the rate of infections in settings where these agents are commonly used.


September 22, 2019

Reconstitution of eukaryotic chromosomes and manipulation of DNA N6-methyladenine alters chromatin and gene expression

DNA N6-adenine methylation (6mA) has recently been reported in diverse eukaryotes, spanning unicellular organisms to metazoans. Yet the functional significance of 6mA remains elusive due to its low abundance, difficulty of manipulation within native DNA, and lack of understanding of eukaryotic 6mA writers. Here, we report a novel DNA 6mA methyltransferase in ciliates, termed MTA1. The enzyme contains an MT-A70 domain but is phylogenetically distinct from all known RNA and DNA methyltransferases. Disruption of MTA1 in vivo leads to the genome-wide loss of 6mA in asexually growing cells and abolishment of the consensus ApT dimethylated motif. Genes exhibit subtle changes in chromatin organization or RNA expression upon loss of 6mA, depending on their starting methylation level. Mutants fail to complete the sexual cycle, which normally coincides with a peak of MTA1 expression. Thus, MTA1 functions in a developmental stage-specific manner. We determine the impact of 6mA on chromatin organization in vitro by reconstructing complete, full-length ciliate chromosomes harboring 6mA in native or ectopic positions. Using these synthetic chromosomes, we show that 6mA directly disfavors nucleosomes in vitro in a local, quantitative manner, independent of DNA sequence. Furthermore, the chromatin remodeler ACF can overcome this effect. Our study identifies a novel MT-A70 protein necessary for eukaryotic 6mA methylation and defines the impact of 6mA on chromatin organization using epigenetically defined synthetic chromosomes.


September 22, 2019

Genomic Tandem Quadruplication is Associated with Ketoconazole Resistance in Malassezia pachydermatis.

Malassezia pachydermatis is a commensal yeast found on the skin of dogs. However, M. pachydermatis is also considered an opportunistic pathogen and is associated with various canine skin diseases including otitis externa and atopic dermatitis, which usually require treatment using an azole antifungal drug, such as ketoconazole. In this study, we isolated a ketoconazole-resistant strain of M. pachydermatis, designated “KCTC 27587,” from the external ear canal of a dog with otitis externa and analyzed its resistance mechanism. To understand the mechanism underlying ketoconazole resistance of the clinical isolate M. pachydermatis KCTC 27587, the whole genome of the yeast was sequenced using the PacBio platform and was compared with M. pachydermatis type strain CBS 1879. We found that a ~84-kb region in chromosome 4 of M. pachydermatis KCTC 27587 was tandemly quadruplicated. The quadruplicated region contains 52 protein coding genes, including the homologs of ERG4 and ERG11, whose overexpression is known to be associated with azole resistance. Our data suggest that the quadruplication of the ~84-kb region may be the cause of the ketoconazole resistance in M. pachydermatis KCTC 27587.


September 22, 2019

Genome sequences of two diploid wild relatives of cultivated sweetpotato reveal targets for genetic improvement

Sweetpotato [Ipomoea batatas (L.) Lam.] is a globally important staple food crop, especially for sub-Saharan Africa. Agronomic improvement of sweetpotato has lagged behind other major food crops due to a lack of genomic and genetic resources and inherent challenges in breeding a heterozygous, clonally propagated polyploid. Here, we report the genome sequences of its two diploid relatives, I. trifida and I. triloba, and show that these high-quality genome assemblies are robust references for hexaploid sweetpotato. Comparative and phylogenetic analyses reveal insights into the ancient whole-genome triplication history of Ipomoea and evolutionary relationships within the Batatas complex. Using resequencing data from 16 genotypes widely used in African breeding programs, genes and alleles associated with carotenoid biosynthesis in storage roots are identified, which may enable efficient breeding of varieties with high provitamin A content. These resources will facilitate genome-enabled breeding in this important food security crop.

