This meeting report highlights key trends that emerged from a conference entitled Post-Transcriptional Gene Regulation in Plants, which was held 14-15 July 2016, as a satellite meeting of the annual meeting of the American Society of Plant Biologists in Austin, Texas. The molecular biology of RNA is emerging as an integral part of the framework for plants’ responses to environmental challenges such as drought and heat, hypoxia, nutrient deprivation, light and pathogens. Moreover, the conference illustrated how a multitude of customized and pioneering omics-related technologies are being applied, more and more often in combination, to describe and dissect the complexities of gene expression at the post-transcriptional level. © 2016 John Wiley & Sons Ltd.
Application of circular consensus sequencing and network analysis to characterize the bovine IgG repertoire.
Vertebrate immune systems generate diverse repertoires of antibodies capable of mediating response to a variety of antigens. Next generation sequencing methods provide unique approaches to a number of immuno-based research areas including antibody discovery and engineering, disease surveillance, and host immune response to vaccines. In particular, single-molecule circular consensus sequencing permits the sequencing of antibody repertoires at previously unattainable depths of coverage and accuracy. We approached the bovine immunoglobulin G (IgG) repertoire with the objective of characterizing diversity of expressed IgG transcripts. Here we present single-molecule real-time sequencing data of expressed IgG heavy-chain repertoires of four individual cattle. We describe the diversity observed within antigen binding regions and visualize this diversity using a network-based approach. We generated 49,945 high quality cDNA sequences, each spanning the entire IgG variable region from four Bos taurus calves. From these sequences we identified 49,521 antigen binding regions using the automated Paratome web server. Approximately 9% of all unique complementarity determining region 2 (CDR2) sequences were of variable lengths. A bimodal distribution of unique CDR3 sequence lengths was observed, with common lengths of 5-6 and 21-25 amino acids. The average number of cysteine residues in CDR3s increased with CDR3 length and we observed that cysteine residues were centrally located in CDR3s. We identified 19 extremely long CDR3 sequences (up to 62 amino acids in length) within IgG transcripts. Network analyses revealed distinct patterns among the expressed IgG antigen binding repertoires of the examined individuals. We utilized circular consensus sequencing technology to provide baseline data of the expressed bovine IgG repertoire that can be used for future studies important to livestock research.
Somatic mutation resulting in base insertions and deletions in CDR2 further diversifies the bovine antibody repertoire. In contrast to previous studies, our data indicate that unusually long CDR3 sequences are not unique to IgM antibodies in cattle. Centrally located cysteine residues in bovine CDR3s provide further evidence that disulfide bond formation is likely of structural importance. We hypothesize that network or cluster-based analyses of expressed antibody repertoires from controlled challenge experiments will help identify novel natural antigen binding solutions to specific pathogens of interest.
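The CDR3 summary statistics described above (length distribution, cysteine count by length, and cysteine position) can be illustrated with a minimal sketch. The function name `summarize_cdr3` and the toy input sequences are hypothetical and are not taken from the study; the relative-position measure (0 = first residue, 1 = last residue) is an assumed way of expressing "centrally located".

```python
from collections import defaultdict


def summarize_cdr3(cdr3_seqs):
    """Summarize unique CDR3 amino acid sequences by length.

    Returns {length: (n_unique, mean_cysteines, mean_relative_cys_position)}.
    The relative position runs from 0.0 (first residue) to 1.0 (last residue);
    it is None when no cysteine is present at that length.
    """
    by_length = defaultdict(list)
    for seq in set(cdr3_seqs):  # work on unique sequences only
        by_length[len(seq)].append(seq)

    summary = {}
    for length, seqs in sorted(by_length.items()):
        mean_cys = sum(s.count("C") for s in seqs) / len(seqs)
        positions = [
            i / (length - 1)
            for s in seqs
            for i, aa in enumerate(s)
            if aa == "C" and length > 1
        ]
        mean_pos = sum(positions) / len(positions) if positions else None
        summary[length] = (len(seqs), mean_cys, mean_pos)
    return summary


# Toy, made-up sequences purely for illustration
print(summarize_cdr3(["ACA", "ACA", "CAAAC"]))
```

A mean relative position near 0.5 would be consistent with centrally located cysteines, although a symmetric pair at both ends gives the same mean, so the full position distribution should be inspected as well.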
The genomes of two fungi isolated from soil (MEA-2) and sediment (SUP5-1) were sequenced. Both were members of the order Hypocreales, closely related to Tolypocladium inflatum, and capable of producing novel secondary metabolites. The draft genomes enabled the characterization of key biosynthetic pathways. Copyright © 2015 Stamps et al.
The Florida manatee (Trichechus manatus latirostris) immunoglobulin heavy chain suggests the importance of clan III variable segments in repertoire diversity.
Manatees are a vulnerable, charismatic sentinel species from the evolutionarily divergent Afrotheria. Manatee health and resistance to infectious disease is of great concern to conservation groups, but little is known about their immune system. To develop manatee-specific tools for monitoring health, we first must have a general knowledge of how the immunoglobulin heavy (IgH) chain locus is organized and transcriptionally expressed. Using the genomic scaffolds of the Florida manatee (Trichechus manatus latirostris), we characterized the potential IgH segmental diversity and constant region isotypic diversity and performed the first Afrotherian repertoire analysis. The Florida manatee has low V(D)J combinatorial diversity (3744 potential combinations) and few constant region isotypes. They also lack clan III V segments, which may have caused reduced VH segment numbers. However, we found productive somatic hypermutation concentrated in the complementarity determining regions. In conclusion, manatees have limited IGHV clan and combinatorial diversity. This suggests that clan III V segments are essential for maintaining IgH locus diversity. Copyright © 2017 Elsevier Ltd. All rights reserved.
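As a rough illustration of how a potential V(D)J combinatorial diversity figure is typically obtained, the sketch below multiplies the numbers of usable V, D, and J segments. This assumes the reported value is a simple product of segment counts; the abstract does not list the counts, so the numbers used here are hypothetical placeholders and not the manatee data.

```python
def vdj_combinations(n_v, n_d, n_j):
    """Potential heavy-chain combinatorial diversity as a simple product.

    Assumes every V segment can pair with every D and every J segment;
    junctional diversity and somatic hypermutation are ignored.
    """
    if min(n_v, n_d, n_j) < 0:
        raise ValueError("segment counts must be non-negative")
    return n_v * n_d * n_j


# Hypothetical segment counts, for illustration only
print(vdj_combinations(n_v=10, n_d=5, n_j=4))
```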
Decline-diseases are complex and becoming increasingly problematic to tree health globally. Acute Oak Decline (AOD) is characterized by necrotic stem lesions and galleries of the bark-boring beetle, Agrilus biguttatus, and represents a serious threat to oak. Although multiple novel bacterial species and Agrilus galleries are associated with AOD lesions, the causative agent(s) are unknown. The AOD pathosystem therefore provides an ideal model for a systems-based research approach to address our hypothesis that AOD lesions are caused by a polymicrobial complex. Here we show that three bacterial species, Brenneria goodwinii, Gibbsiella quercinecans and Rahnella victoriana, are consistently abundant in the lesion microbiome and possess virulence genes used by canonical phytopathogens that are expressed in AOD lesions. Individual and polyspecies inoculations on oak logs and trees demonstrated that B. goodwinii and G. quercinecans cause tissue necrosis and, in combination with A. biguttatus, produce the diagnostic symptoms of AOD. We have proved a polybacterial cause of AOD lesions, providing new insights into polymicrobial interactions and tree disease. This work presents a novel conceptual and methodological template for adapting Koch’s postulates to address the role of microbial communities in disease.
Blood CXCR3+CD4 T cells are enriched in inducible replication competent HIV in aviremic antiretroviral therapy-treated individuals.
We recently demonstrated that lymph nodes (LNs) PD-1+/T follicular helper (Tfh) cells from antiretroviral therapy (ART)-treated HIV-infected individuals were enriched in cells containing replication competent virus. However, the distribution of cells containing inducible replication competent virus has been only partially elucidated in blood memory CD4 T-cell populations including the Tfh cell counterpart circulating in blood (cTfh). In this context, we have investigated the distribution of (1) total HIV-infected cells and (2) cells containing replication competent and infectious virus within various blood and LN memory CD4 T-cell populations of conventional antiretroviral therapy (cART)-treated HIV-infected individuals. In the present study, we show that blood CXCR3-expressing memory CD4 T cells are enriched in cells containing inducible replication competent virus and contributed the most to the total pool of cells containing replication competent and infectious virus in blood. Interestingly, subsequent proviral sequence analysis did not indicate virus compartmentalization between blood and LN CD4 T-cell populations, suggesting dynamic interchanges between the two compartments. We then investigated whether the composition of blood HIV reservoir may reflect the polarization of LN CD4 T cells at the time of reservoir seeding and showed that LN PD-1+CD4 T cells of viremic untreated HIV-infected individuals expressed significantly higher levels of CXCR3 as compared to CCR4 and/or CCR6, suggesting that blood CXCR3-expressing CD4 T cells may originate from LN PD-1+CD4 T cells. Taken together, these results indicate that blood CXCR3-expressing CD4 T cells represent the major blood compartment containing inducible replication competent virus in treated aviremic HIV-infected individuals.
Complete genome sequence of N2-fixing model strain Klebsiella sp. nov. M5al, which produces plant cell wall-degrading enzymes and siderophores.
The bacterial strain M5al is a model strain for studying the molecular genetics of N2-fixation and molecular engineering of microbial production of platform chemicals 1,3-propanediol and 2,3-butanediol. Here, we present the complete genome sequence of the strain M5al, which belongs to a novel species closely related to Klebsiella michiganensis. M5al secretes plant cell wall-degrading enzymes and colonizes rice roots but does not cause soft rot disease. M5al also produces siderophores and contains the gene clusters for synthesis and transport of yersiniabactin, which is a critical virulence factor for Klebsiella pathogens in causing human disease. We propose that the model strain M5al can be genetically modified to study bacterial N2-fixation in association with non-legume plants and production of 1,3-propanediol and 2,3-butanediol through degradation of plant cell wall biomass.
Rapid allopolyploid radiation of moonwort ferns (Botrychium; Ophioglossaceae) revealed by PacBio sequencing of homologous and homeologous nuclear regions.
Polyploidy is a major speciation process in vascular plants, and is postulated to be particularly important in shaping the diversity of extant ferns. However, limitations in the availability of bi-parental markers for ferns have greatly limited phylogenetic investigation of polyploidy in this group. With a large number of allopolyploid species, the genus Botrychium is a classic example in ferns where recurrent polyploidy is postulated to have driven frequent speciation events. Here, we use PacBio sequencing and the PURC bioinformatics pipeline to capture all homeologous or allelic copies of four long (~1 kb) low-copy nuclear regions from a sample of 45 specimens (25 diploids and 20 polyploids) representing 37 Botrychium taxa, and three outgroups. This sample includes most currently recognized Botrychium species in Europe and North America, and the majority of our specimens were genotyped with co-dominant nuclear allozymes to ensure species identification. We analyzed the sequence data using maximum likelihood (ML) and Bayesian inference (BI) concatenated-data (“gene tree”) approaches to explore the relationships among Botrychium species. Finally, we estimated divergence times among Botrychium lineages and inferred the multi-labeled polyploid species tree showing the origins of the polyploid taxa, and their relationships to each other and to their diploid progenitors. We found strong support for the monophyly of the major lineages within Botrychium and identified most of the parental donors of the polyploids; these results largely corroborate earlier morphological and allozyme-based investigations. Each polyploid had at least two distinct homeologs, indicating that all sampled polyploids are likely allopolyploids (rather than autopolyploids).
Our divergence-time analyses revealed that these allopolyploid lineages originated recently (within the last two million years) and thus that the genus has undergone a recent radiation, correlated with multiple independent allopolyploidizations across the phylogeny. Also, we found strong parental biases in the formation of allopolyploids, with individual diploid species participating multiple times as either the maternal or paternal donor (but not both). Finally, we discuss the role of polyploidy in the evolutionary history of Botrychium and the interspecific reproductive barriers possibly involved in these parental biases. Copyright © 2017 Elsevier Inc. All rights reserved.
Loss of stomach, loss of appetite? Sequencing of the ballan wrasse (Labrus bergylta) genome and intestinal transcriptomic profiling illuminate the evolution of loss of stomach function in fish.
The ballan wrasse (Labrus bergylta) belongs to a large teleost family containing more than 600 species showing several unique evolutionary traits such as lack of stomach and hermaphroditism. Agastric fish are found throughout the teleost phylogeny, in quite diverse and unrelated lineages, indicating stomach loss has occurred independently multiple times in the course of evolution. By assembling the ballan wrasse genome and transcriptome we aimed to determine the genetic basis for its digestive system function and appetite regulation. Among other things, this knowledge will aid the formulation of aquaculture diets that meet the nutritional needs of agastric species. Long and short read sequencing technologies were combined to generate a ballan wrasse genome of 805 Mbp. Analysis of the genome and transcriptome assemblies confirmed the absence of genes that code for proteins involved in gastric function. The gene coding for the appetite-stimulating protein ghrelin was also absent in wrasse. Gene synteny mapping identified several appetite-controlling genes and their paralogs previously undescribed in fish. Transcriptome profiling along the length of the intestine found a declining expression gradient from the anterior to the posterior, and a distinct expression profile in the hind gut. We showed gene loss has occurred for all known genes related to stomach function in the ballan wrasse, while the remaining functions of the digestive tract appear intact. The results also show appetite control in ballan wrasse has undergone substantial changes. The loss of ghrelin suggests that other genes, such as motilin, may play a ghrelin-like role. The wrasse genome offers novel insight into the evolutionary traits of this large family. As the stomach plays a major role in protein digestion, the lack of genes related to stomach digestion in wrasse suggests it requires formulated diets with higher levels of readily digestible protein than those for gastric species.
DNA strand-exchange patterns associated with double-strand break-induced and spontaneous mitotic crossovers in Saccharomyces cerevisiae.
Mitotic recombination can result in loss of heterozygosity and chromosomal rearrangements that shape genome structure and initiate human disease. Engineered double-strand breaks (DSBs) are a potent initiator of recombination, but whether spontaneous events initiate with the breakage of one or both DNA strands remains unclear. In the current study, a crossover (CO)-specific assay was used to compare heteroduplex DNA (hetDNA) profiles, which reflect strand exchange intermediates, associated with DSB-induced versus spontaneous events in yeast. Most DSB-induced CO products had the two-sided hetDNA predicted by the canonical DSB repair model, with a switch in hetDNA position from one product to the other at the position of the break. Approximately 40% of COs, however, had hetDNA on only one side of the initiating break. This anomaly can be explained by a modified model in which there is frequent processing of an early invasion (D-loop) intermediate prior to extension of the invading end. Finally, hetDNA tracts exhibited complexities consistent with frequent expansion of the DSB into a gap, migration of strand-exchange junctions, and template switching during gap-filling reactions. hetDNA patterns in spontaneous COs isolated in either a wild-type background or in a background with elevated levels of reactive oxygen species (tsa1Δ mutant) were similar to those associated with the DSB-induced events, suggesting that DSBs are the major instigator of spontaneous mitotic recombination in yeast.
Since the discovery of the T cell receptor (TcR), immunologists have assigned somatic hypermutation (SHM) as a mechanism employed solely by B cells to diversify their antigen receptors. Remarkably, we found SHM acting in the thymus on the α chain locus of shark TcR. SHM in developing shark T cells likely is catalyzed by activation-induced cytidine deaminase (AID) and results in both point and tandem mutations that accumulate non-conservative amino acid replacements within complementarity-determining regions (CDRs). Mutation frequency at TcRα was as high as that seen at B cell receptor loci (BcR) in sharks and mammals, and the mechanism of SHM shares unique characteristics first detected at shark BcR loci. Additionally, fluorescence in situ hybridization showed the strongest AID expression in thymic corticomedullary junction and medulla. We suggest that TcRα utilizes SHM to broaden diversification of the primary αβ T cell repertoire in sharks, the first reported use in vertebrates. © 2018, Ott et al.
The chromosomes of many eukaryotes have regions of high GC content interspersed with regions of low GC content. In the yeast Saccharomyces cerevisiae, high-GC regions are often associated with high levels of meiotic recombination. In this study, we constructed URA3 genes that differ substantially in their base composition [URA3-AT (31% GC), URA3-WT (43% GC), and URA3-GC (63% GC)] but encode proteins with the same amino acid sequence. The strain with URA3-GC had an approximately sevenfold elevated rate of ura3 mutations compared with the strains with URA3-WT or URA3-AT. About half of these mutations were single-base substitutions and were dependent on the error-prone DNA polymerase ζ. About 30% were deletions or duplications between short (5-10 base) direct repeats resulting from DNA polymerase slippage. The URA3-GC gene also had elevated rates of meiotic and mitotic recombination relative to the URA3-AT or URA3-WT genes. Thus, base composition has a substantial effect on the basic parameters of genome stability and evolution. Copyright © 2018 the Author(s). Published by PNAS.
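The base-composition percentages quoted for the URA3 variants are GC fractions of the coding sequence. A minimal sketch of that calculation follows; the function name `gc_percent` and the example sequences are hypothetical and are not the URA3 constructs from the study.

```python
def gc_percent(seq):
    """Return the percentage of G and C bases in a DNA sequence.

    Case-insensitive; only A, C, G, and T are counted in the denominator,
    so ambiguous bases such as N are ignored.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * gc / total


# Two made-up synonymous codon strings (both encode Leu-Ala-Gly)
print(gc_percent("TTAGCTGGT"))  # AT-rich codon choices
print(gc_percent("CTGGCCGGC"))  # GC-rich codon choices
```

The two toy strings show the idea behind the constructs: synonymous codon choices can shift GC content substantially without changing the encoded protein.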
Retrotransposons are the major contributors to the expansion of the Drosophila ananassae Muller F element.
The discordance between genome size and the complexity of eukaryotes can partly be attributed to differences in repeat density. The Muller F element (~5.2 Mb) is the smallest chromosome in Drosophila melanogaster, but it is substantially larger (>18.7 Mb) in D. ananassae. To identify the major contributors to the expansion of the F element and to assess their impact, we improved the genome sequence and annotated the genes in a 1.4-Mb region of the D. ananassae F element, and a 1.7-Mb region from the D element for comparison. We find that transposons (particularly LTR and LINE retrotransposons) are major contributors to this expansion (78.6%), while Wolbachia sequences integrated into the D. ananassae genome are minor contributors (0.02%). Both D. melanogaster and D. ananassae F-element genes exhibit distinct characteristics compared to D-element genes (e.g., larger coding spans, larger introns, more coding exons, and lower codon bias), but these differences are exaggerated in D. ananassae. Compared to D. melanogaster, the codon bias observed in D. ananassae F-element genes can primarily be attributed to mutational biases instead of selection. The 5′ ends of F-element genes in both species are enriched in dimethylation of lysine 4 on histone 3 (H3K4me2), while the coding spans are enriched in H3K9me2. Despite differences in repeat density and gene characteristics, D. ananassae F-element genes show a similar range of expression levels compared to genes in euchromatic domains. This study improves our understanding of how transposons can affect genome size and how genes can function within highly repetitive domains. Copyright © 2017 Leung et al.
The utility of PacBio circular consensus sequencing for characterizing complex gene families in non-model organisms.
Molecular characterization of highly diverse gene families can be time consuming, expensive, and difficult, especially when considering the potential for relatively large numbers of paralogs and/or pseudogenes. Here we investigate the utility of Pacific Biosciences single molecule real-time (SMRT) circular consensus sequencing (CCS) as an alternative to traditional cloning and Sanger sequencing PCR amplicons for gene family characterization. We target vomeronasal gene receptors, one of the most diverse gene families in mammals, with the goal of better understanding intra-specific V1R diversity of the gray mouse lemur (Microcebus murinus). Our study compares intragenomic variation for two V1R subfamilies found in the mouse lemur. Specifically, we compare gene copy variation within and between two individuals of M. murinus as characterized by different methods for nucleotide sequencing. By including the same individual animal from which the M. murinus draft genome was derived, we are able to cross-validate gene copy estimates from Sanger sequencing versus CCS methods. We generated 34,088 high quality circular consensus sequences of two diverse V1R subfamilies (here referred to as V1RI and V1RIX) from two individuals of Microcebus murinus. Using a minimum threshold of 7× coverage, we recovered approximately 90% of V1RI sequences previously identified in the draft M. murinus genome (59% being identical at all nucleotide positions). When low coverage sequences were considered (i.e. < 7× coverage) 100% of V1RI sequences identified in the draft genome were recovered. At least 13 putatively novel V1R loci were also identified using CCS technology. Recent upgrades to the Pacific Biosciences RS instrument have improved the CCS technology and offer an alternative to traditional sequencing approaches. Our results suggest that the Microcebus murinus V1R repertoire has been underestimated in the draft genome.
In addition to providing an improved understanding of V1R diversity in the mouse lemur, this study demonstrates the utility of CCS technology for characterizing complex regions of the genome. We anticipate that long-read sequencing technologies such as PacBio SMRT will allow for the assembly of multigene family clusters and serve to more accurately characterize patterns of gene copy variation in large gene families, thus revealing novel micro-evolutionary patterns within non-model organisms.
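The effect of a minimum coverage threshold on how many reference sequences are recovered can be sketched as below. This is only an illustration under a stated assumption: "coverage" is taken here to mean the number of identical CCS reads supporting a given sequence, which the abstract does not define. The function names and the toy reads are hypothetical.

```python
from collections import Counter


def filter_by_coverage(reads, min_coverage=7):
    """Keep unique sequences supported by at least `min_coverage` identical reads.

    Assumption: coverage = count of identical reads for a sequence.
    Returns {sequence: read_count}.
    """
    counts = Counter(reads)
    return {seq: n for seq, n in counts.items() if n >= min_coverage}


def fraction_recovered(reference_seqs, recovered_seqs):
    """Fraction of reference sequences found exactly among the recovered ones."""
    reference = set(reference_seqs)
    if not reference:
        raise ValueError("reference set is empty")
    return len(reference & set(recovered_seqs)) / len(reference)


# Toy data: one well-supported sequence and one low-coverage sequence
reads = ["ACGT"] * 8 + ["TTGA"] * 3
strict = filter_by_coverage(reads, min_coverage=7)
relaxed = filter_by_coverage(reads, min_coverage=1)
print(fraction_recovered(["ACGT", "TTGA"], strict))
print(fraction_recovered(["ACGT", "TTGA"], relaxed))
```

Lowering the threshold recovers more reference sequences at the cost of admitting reads with weaker support, which mirrors the trade-off between the 7× and below-7× results reported above.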
Despite modern sequencing efforts, the difficulty in assembly of highly repetitive sequences has prevented resolution of human genome gaps, including some in the coding regions of genes with important biological functions. One such gene, MUC5AC, encodes a large, secreted mucin, which is one of the two major secreted mucins in human airways. The MUC5AC region contains a gap in the human genome reference (hg19) across the large, highly repetitive, and complex central exon. This exon is predicted to contain imperfect tandem repeat sequences and multiple conserved cysteine-rich (CysD) domains. To resolve the MUC5AC genomic gap, we used high-fidelity long PCR followed by single molecule real-time (SMRT) sequencing. This technology yielded long sequence reads and robust coverage that allowed for de novo sequence assembly spanning the entire repetitive region. Furthermore, we used SMRT sequencing of PCR amplicons covering the central exon to identify genetic variation in four individuals. The results demonstrated the presence of segmental duplications of CysD domains, insertions/deletions (indels) of tandem repeats, and single nucleotide variants. Additional studies demonstrated that one of the identified tandem repeat insertions is tagged by nonexonic single nucleotide polymorphisms. Taken together, these data illustrate the successful utility of SMRT sequencing long reads for de novo assembly of large repetitive sequences to fill the gaps in the human genome. Characterization of the MUC5AC gene and the sequence variation in the central exon will facilitate genetic and functional studies for this critical airway mucin.