Single-molecule, real-time sequencing developed by Pacific BioSciences offers longer read lengths than the second-generation sequencing (SGS) technologies, making it well-suited for unsolved problems in genome, transcriptome, and epigenetics research. The highly-contiguous de novo assemblies using PacBio sequencing can close gaps in current reference assemblies and characterize structural variation (SV) in personal genomes. With longer reads, we can sequence through extended repetitive regions and detect mutations, many of which are associated with diseases. Moreover, PacBio transcriptome sequencing is advantageous for the identification of gene isoforms and facilitates reliable discoveries of novel genes and novel isoforms of annotated genes, due to its ability to sequence full-length transcripts or fragments with significant lengths. Additionally, PacBio’s sequencing technique provides information that is useful for the direct detection of base modifications, such as methylation. In addition to using PacBio sequencing alone, many hybrid sequencing strategies have been developed to make use of more accurate short reads in conjunction with PacBio long reads. In general, hybrid sequencing strategies are more affordable and scalable especially for small-size laboratories than using PacBio Sequencing alone. The advent of PacBio sequencing has made available much information that could not be obtained via SGS alone. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd.. All rights reserved.
Evidence of the red-queen hypothesis from accelerated rates of evolution of genes involved in biotic interactions in Pneumocystis.
Pneumocystis species are ascomycete fungi adapted to live inside the lungs of mammals. These ascomycetes show extensive stenoxenism, meaning that each species of Pneumocystis infects a single species of host. Here, we study the effect exerted by natural selection on gene evolution in the genomes of three Pneumocystis species. We show that genes involved in host interaction evolve under positive selection. In the first place, we found strong evidence of episodic diversifying selection in Major surface glycoproteins (Msg). These proteins are located on the surface of Pneumocystis and are used for host attachment and probably for immune system evasion. Consistent with their function as antigens, most sites under diversifying selection in Msg code for residues with large relative surface accessibility areas. We also found evidence of positive selection in part of the cell machinery used to export Msg to the cell surface. Specifically, we found that genes participating in glycosylphosphatidylinositol (GPI) biosynthesis show an increased rate of nonsynonymous substitutions (dN) versus synonymous substitutions (dS). GPI is a molecule synthesized in the endoplasmic reticulum that is used to anchor proteins to membranes. We interpret the aforementioned findings as evidence of selective pressure exerted by the host immune system on Pneumocystis species, shaping the evolution of Msg and several proteins involved in GPI biosynthesis. We suggest that genome evolution in Pneumocystis is well described by the Red-Queen hypothesis whereby genes relevant for biotic interactions show accelerated rates of evolution.
Genome and secretome analysis of Pochonia chlamydosporia provide new insight into egg-parasitic mechanisms.
Pochonia chlamydosporia infects eggs and females of economically important plant-parasitic nematodes. The fungal isolates parasitizing different nematodes are genetically distinct. To understand their intraspecific genetic differentiation, parasitic mechanisms, and adaptive evolution, we assembled seven putative chromosomes of P. chlamydosporia strain 170 isolated from root-knot nematode eggs (~44?Mb, including 7.19% of transposable elements) and compared them with the genome of the strain 123 (~41?Mb) isolated from cereal cyst nematode. We focus on secretomes of the fungus, which play important roles in pathogenicity and fungus-host/environment interactions, and identified 1,750 secreted proteins, with a high proportion of carboxypeptidases, subtilisins, and chitinases. We analyzed the phylogenies of these genes and predicted new pathogenic molecules. By comparative transcriptome analysis, we found that secreted proteins involved in responses to nutrient stress are mainly comprised of proteases and glycoside hydrolases. Moreover, 32 secreted proteins undergoing positive selection and 71 duplicated gene pairs encoding secreted proteins are identified. Two duplicated pairs encoding secreted glycosyl hydrolases (GH30), which may be related to fungal endophytic process and lost in many insect-pathogenic fungi but exist in nematophagous fungi, are putatively acquired from bacteria by horizontal gene transfer. The results help understanding genetic origins and evolution of parasitism-related genes.
Plant genome size varies by four orders of magnitude, and most of this variation stems from dynamic changes in repetitive DNA content. Here we report the small 109?Mb genome of Selaginella lepidophylla, a clubmoss with extreme desiccation tolerance. Single-molecule sequencing enables accurate haplotype assembly of a single heterozygous S. lepidophylla plant, revealing extensive structural variation. We observe numerous haplotype-specific deletions consisting of largely repetitive and heavily methylated sequences, with enrichment in young Gypsy LTR retrotransposons. Such elements are active but rapidly deleted, suggesting “bloat and purge” to maintain a small genome size. Unlike all other land plant lineages, Selaginella has no evidence of a whole-genome duplication event in its evolutionary history, but instead shows unique tandem gene duplication patterns reflecting adaptation to extreme drying. Gene expression changes during desiccation in S. lepidophylla mirror patterns observed across angiosperm resurrection plants.
Pangenome analyses of the wheat pathogen Zymoseptoria tritici reveal the structural basis of a highly plastic eukaryotic genome.
Structural variation contributes substantially to polymorphism within species. Chromosomal rearrangements that impact genes can lead to functional variation among individuals and influence the expression of phenotypic traits. Genomes of fungal pathogens show substantial chromosomal polymorphism that can drive virulence evolution on host plants. Assessing the adaptive significance of structural variation is challenging, because most studies rely on inferences based on a single reference genome sequence.We constructed and analyzed the pangenome of Zymoseptoria tritici, a major pathogen of wheat that evolved host specialization by chromosomal rearrangements and gene deletions. We used single-molecule real-time sequencing and high-density genetic maps to assemble multiple genomes. We annotated the gene space based on transcriptomics data that covered the infection life cycle of each strain. Based on a total of five telomere-to-telomere genomes, we constructed a pangenome for the species and identified a core set of 9149 genes. However, an additional 6600 genes were exclusive to a subset of the isolates. The substantial accessory genome encoded on average fewer expressed genes but a larger fraction of the candidate effector genes that may interact with the host during infection. We expanded our analyses of the pangenome to a worldwide collection of 123 isolates of the same species. We confirmed that accessory genes were indeed more likely to show deletion polymorphisms and loss-of-function mutations compared to core genes.The pangenome construction of a highly polymorphic eukaryotic pathogen showed that a single reference genome significantly underestimates the gene space of a species. The substantial accessory genome provides a cradle for adaptive evolution.
Nuclear and mitochondrial genomes of the hybrid fungal plant pathogen Verticillium longisporum display a mosaic structure
Allopolyploidization, genome duplication through interspecific hybridization, is an important evolutionary mechanism that can enable organisms to adapt to environmental changes or stresses. This increased adaptive potential of allopolyploids can be particularly relevant for plant pathogens in their quest for host immune response evasion. Allodiploidization likely caused the shift in host range of the fungal pathogen plant Verticillium longisporum, as V. longisporum mainly infects Brassicaceae plants in contrast to haploid Verticillium spp. In this study, we investigated the allodiploid genome structure of V. longisporum and its evolution in the hybridization aftermath. The nuclear genome of V. longisporum displays a mosaic structure, as numerous contigs consists of sections of both parental origins. V. longisporum encountered extensive genome rearrangements, whereas the contribution of gene conversion is negligible. Thus, the mosaic genome structure mainly resulted from genomic rearrangements between parental chromosome sets. Furthermore, a mosaic structure was also found in the mitochondrial genome, demonstrating its bi-parental inheritance. In conclusion, the nuclear and mitochondrial genomes of V. longisporum parents interacted dynamically in the hybridization aftermath. Conceivably, novel combinations of DNA sequence of different parental origin facilitated genome stability after hybridization and consecutive niche adaptation of V. longisporum.
The genome of the Hi5 germ cell line from Trichoplusia ni, an agricultural pest and novel model for small RNA biology.
We report a draft assembly of the genome of Hi5 cells from the lepidopteran insect pest,Trichoplusia ni, assigning 90.6% of bases to one of 28 chromosomes and predicting 14,037 protein-coding genes. Chemoreception and detoxification gene families revealT. ni-specific gene expansions that may explain its widespread distribution and rapid adaptation to insecticides. Transcriptome and small RNA data from thorax, ovary, testis, and the germline-derived Hi5 cell line show distinct expression profiles for 295 microRNA- and >393 piRNA-producing loci, as well as 39 genes encoding small RNA pathway proteins. Nearly all of the W chromosome is devoted to piRNA production, andT. nisiRNAs are not 2´-O-methylated. To enable use of Hi5 cells as a model system, we have established genome editing and single-cell cloning protocols. TheT. nigenome provides insights into pest control and allows Hi5 cells to become a new tool for studying small RNAs ex vivo.© 2018, Fu et al.
Repetitive DNA plays a fundamental role in the organization, size and evolution of eukaryotic genomes. The sequencing of the turbot revealed a small and compact genome, as in all flatfish studied to date. The assembly of repetitive regions is still incomplete because it is difficult to correctly identify their position, number and array. The combination of classical cytogenetic techniques along with high quality sequencing is essential to increase the knowledge of the structure and composition of these sequences and, thus, of the structure and function of the whole genome. In this work, the in silico analysis of H1 histone, 5S rDNA, telomeric and Rex repetitive sequences, was compared to their chromosomal mapping by fluorescent in situ hybridization (FISH), providing a more comprehensive picture of these elements in the turbot genome. FISH assays confirmed the location of H1 in LG8; 5S rDNA in LG4 and LG6; telomeric sequences at the end of all chromosomes whereas Rex elements were dispersed along most chromosomes. The discrepancies found between both approaches could be related to the sequencing methodology applied in this species and also to the resolution limitations of the FISH technique. Turbot cytogenomic analyses have proven to add new chromosomal landmarks in the karyotype of this species, representing a powerful tool to investigate targeted genomic sequences or regions in the genetic and physical maps of this species. Copyright © 2017 Elsevier B.V. All rights reserved.
Genome sequences of Chlorella sorokiniana UTEX 1602 and Micractinium conductrix SAG 241.80: implications to maltose excretion by a green alga.
Green algae represent a key segment of the global species capable of photoautotrophic-driven biological carbon fixation. Algae partition fixed-carbon into chemical compounds required for biomass, while diverting excess carbon into internal storage compounds such as starch and lipids or, in certain cases, into targeted extracellular compounds. Two green algae were selected to probe for critical components associated with sugar production and release in a model alga. Chlorella sorokiniana UTEX 1602 – which does not release significant quantities of sugars to the extracellular space – was selected as a control to compare with the maltose-releasing Micractinium conductrix SAG 241.80 – which was originally isolated from an endosymbiotic association with the ciliate Paramecium bursaria. Both strains were subjected to three sequencing approaches to assemble their genomes and annotate their genes. This analysis was further complemented with transcriptional studies during maltose release by M. conductrix SAG 241.80 versus conditions where sugar release is minimal. The annotation revealed that both strains contain homologs for the key components of a putative pathway leading to cytosolic maltose accumulation, while transcriptional studies found few changes in mRNA levels for the genes associated with these established intracellular sugar pathways. A further analysis of potential sugar transporters found multiple homologs for SWEETs and tonoplast sugar transporters. The analysis of transcriptional differences revealed a lesser and more measured global response for M. conductrix SAG 241.80 versus C. sorokiniana UTEX 1602 during conditions resulting in sugar release, providing a catalog of genes that might play a role in extracellular sugar transport.© 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.
Reference quality genome assemblies of three Parastagonospora nodorum isolates differing in virulence on wheat.
Parastagonospora nodorum, the causal agent of Septoria nodorum blotch in wheat, has emerged as a model necrotrophic fungal organism for the study of host-microbe interactions. To date, three necrotrophic effectors have been identified and characterized from this pathogen, including SnToxA, SnTox1, and SnTox3. Necrotrophic effector identification was greatly aided by the development of a draft genome of Australian isolate SN15 via Sanger sequencing, yet it remained largely fragmented. This research presents the development of nearly finished genomes of P. nodorum isolates Sn4, Sn2000, and Sn79-1087 using long-read sequencing technology. RNAseq analysis of isolate Sn4, consisting of eight time points covering various developmental and infection stages, mediated the annotation of 13,379 genes. Analysis of these genomes revealed large-scale polymorphism between the three isolates, including the complete absence of contig 23 from isolate Sn79-1087, and a region of genome expansion on contig 10 in isolates Sn4 and Sn2000. Additionally, these genomes exhibit the hallmark characteristics of a “two-speed” genome, being partitioned into two distinct GC-equilibrated and AT-rich compartments. Interestingly, isolate Sn79-1087 contains a lower proportion of AT-rich segments, indicating a potential lack of evolutionary hotspots. These newly sequenced genomes, consisting of telomere-to-telomere assemblies of nearly all 23 P. nodorum chromosomes, provide a robust foundation for the further examination of effector biology and genome evolution. Copyright © 2018 Richards et al.
De novo assembly and phasing of dikaryotic genomes from two isolates of Puccinia coronata f. sp. avenae, the causal agent of oat crown rust.
Oat crown rust, caused by the fungus Pucinnia coronata f. sp. avenae, is a devastating disease that impacts worldwide oat production. For much of its life cycle, P. coronata f. sp. avenae is dikaryotic, with two separate haploid nuclei that may vary in virulence genotype, highlighting the importance of understanding haplotype diversity in this species. We generated highly contiguous de novo genome assemblies of two P. coronata f. sp. avenae isolates, 12SD80 and 12NC29, from long-read sequences. In total, we assembled 603 primary contigs for 12SD80, for a total assembly length of 99.16 Mbp, and 777 primary contigs for 12NC29, for a total length of 105.25 Mbp; approximately 52% of each genome was assembled into alternate haplotypes. This revealed structural variation between haplotypes in each isolate equivalent to more than 2% of the genome size, in addition to about 260,000 and 380,000 heterozygous single-nucleotide polymorphisms in 12SD80 and 12NC29, respectively. Transcript-based annotation identified 26,796 and 28,801 coding sequences for isolates 12SD80 and 12NC29, respectively, including about 7,000 allele pairs in haplotype-phased regions. Furthermore, expression profiling revealed clusters of coexpressed secreted effector candidates, and the majority of orthologous effectors between isolates showed conservation of expression patterns. However, a small subset of orthologs showed divergence in expression, which may contribute to differences in virulence between 12SD80 and 12NC29. This study provides the first haplotype-phased reference genome for a dikaryotic rust fungus as a foundation for future studies into virulence mechanisms in P. coronata f. sp. avenaeIMPORTANCE Disease management strategies for oat crown rust are challenged by the rapid evolution of Puccinia coronata f. sp. avenae, which renders resistance genes in oat varieties ineffective. Despite the economic importance of understanding P. coronata f. sp. avenae, resources to study the molecular mechanisms underpinning pathogenicity and the emergence of new virulence traits are lacking. Such limitations are partly due to the obligate biotrophic lifestyle of P. coronata f. sp. avenae as well as the dikaryotic nature of the genome, features that are also shared with other important rust pathogens. This study reports the first release of a haplotype-phased genome assembly for a dikaryotic fungal species and demonstrates the amenability of using emerging technologies to investigate genetic diversity in populations of P. coronata f. sp. avenae. Copyright © 2018 Miller et al.
Comparative heterochromatin profiling reveals conserved and unique epigenome signatures linked to adaptation and development of malaria parasites.
Heterochromatin-dependent gene silencing is central to the adaptation and survival of Plasmodium falciparum malaria parasites, allowing clonally variant gene expression during blood infection in humans. By assessing genome-wide heterochromatin protein 1 (HP1) occupancy, we present a comprehensive analysis of heterochromatin landscapes across different Plasmodium species, strains, and life cycle stages. Common targets of epigenetic silencing include fast-evolving multi-gene families encoding surface antigens and a small set of conserved HP1-associated genes with regulatory potential. Many P. falciparum heterochromatic genes are marked in a strain-specific manner, increasing the parasite’s adaptive capacity. Whereas heterochromatin is strictly maintained during mitotic proliferation of asexual blood stage parasites, substantial heterochromatin reorganization occurs in differentiating gametocytes and appears crucial for the activation of key gametocyte-specific genes and adaptation of erythrocyte remodeling machinery. Collectively, these findings provide a catalog of heterochromatic genes and reveal conserved and specialized features of epigenetic control across the genus Plasmodium. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
With approximately one-third of their genomes consisting of linear and circular plasmids, the Lyme disease agent cluster of species has the most complex genomes among known bacteria. We report here a comparative analysis of plasmids in eleven Borreliella (also known as Borrelia burgdorferi sensu lato) species.We sequenced the complete genomes of two B. afzelii, two B. garinii, and individual B. spielmanii, B. bissettiae, B. valaisiana and B. finlandensis isolates. These individual isolates carry between seven and sixteen plasmids, and together harbor 99 plasmids. We report here a comparative analysis of these plasmids, along with 70 additional Borreliella plasmids available in the public sequence databases. We identify only one new putative plasmid compatibility type (the 30th) among these 169 plasmid sequences, suggesting that all or nearly all such types have now been discovered. We find that the linear plasmids in the non-B. burgdorferi species have undergone the same kinds of apparently random, chaotic rearrangements mediated by non-homologous recombination that we previously discovered in B. burgdorferi. These rearrangements occurred independently in the different species lineages, and they, along with an expanded chromosomal phylogeny reported here, allow the identification of several whole plasmid transfer events among these species. Phylogenetic analyses of the plasmid partition genes show that a majority of the plasmid compatibility types arose early, most likely before separation of the Lyme agent Borreliella and relapsing fever Borrelia clades, and this, with occasional cross species plasmid transfers, has resulted in few if any species-specific or geographic region-specific Borreliella plasmid types.The primordial origin and persistent maintenance of the Borreliella plasmid types support their functional indispensability as well as evolutionary roles in facilitating genome diversity. The improved resolution of Borreliella plasmid phylogeny based on conserved partition-gene clusters will lead to better determination of gene orthology which is essential for prediction of biological function, and it will provide a basis for inferring detailed evolutionary mechanisms of Borreliella genomic variability including homologous gene and plasmid exchanges as well as non-homologous rearrangements.
DNA strand-exchange patterns associated with double-strand break-induced and spontaneous mitotic crossovers in Saccharomyces cerevisiae.
Mitotic recombination can result in loss of heterozygosity and chromosomal rearrangements that shape genome structure and initiate human disease. Engineered double-strand breaks (DSBs) are a potent initiator of recombination, but whether spontaneous events initiate with the breakage of one or both DNA strands remains unclear. In the current study, a crossover (CO)-specific assay was used to compare heteroduplex DNA (hetDNA) profiles, which reflect strand exchange intermediates, associated with DSB-induced versus spontaneous events in yeast. Most DSB-induced CO products had the two-sided hetDNA predicted by the canonical DSB repair model, with a switch in hetDNA position from one product to the other at the position of the break. Approximately 40% of COs, however, had hetDNA on only one side of the initiating break. This anomaly can be explained by a modified model in which there is frequent processing of an early invasion (D-loop) intermediate prior to extension of the invading end. Finally, hetDNA tracts exhibited complexities consistent with frequent expansion of the DSB into a gap, migration of strand-exchange junctions, and template switching during gap-filling reactions. hetDNA patterns in spontaneous COs isolated in either a wild-type background or in a background with elevated levels of reactive oxygen species (tsa1? mutant) were similar to those associated with the DSB-induced events, suggesting that DSBs are the major instigator of spontaneous mitotic recombination in yeast.
Trypanosoma cruzi, a zoonotic kinetoplastid protozoan with a complex genome, is the causative agent of American trypanosomiasis (Chagas disease). The parasite uses a highly diverse repertoire of surface molecules, with roles in cell invasion, immune evasion and pathogenesis. Thus far, the genomic regions containing these genes have been impossible to resolve and it has been impossible to study the structure and function of the several thousand repetitive genes encoding the surface molecules of the parasite. We here present an improved genome assembly of a T. cruzi clade I (TcI) strain using high coverage PacBio single molecule sequencing, together with Illumina sequencing of 34 T. cruzi TcI isolates and clones from different geographic locations, sample sources and clinical outcomes. Resolution of the surface molecule gene structure reveals an unusual duality in the organisation of the parasite genome, a core genomic region syntenous with related protozoa flanked by unique and highly plastic subtelomeric regions encoding surface antigens. The presence of abundant interspersed retrotransposons in the subtelomeres suggests that these elements are involved in a recombination mechanism for the generation of antigenic variation and evasion of the host immune response. The comparative genomic analysis of the cohort of TcI strains revealed multiple cases of such recombination events involving surface molecule genes and has provided new insights into T. cruzi population structure.