In the past two decades, Chinese scientists have made significant progress in three aspects of wheat genetic transformation. First, the wheat transformation platform has been established and optimized to improve transformation efficiency, to shorten the time from the start of the transformation procedure to the recovery of fertile transgenic wheat plants, and to overcome the genotype dependence of genetic transformation across a wide range of elite wheat varieties. Second, with the help of emerging techniques such as CRISPR/Cas9, the function of more than 100 wheat genes has been investigated. Finally, modern technology has been combined with the traditional breeding technique…
Viruses of the subfamily Orthoretrovirinae are defined by the ability to reverse transcribe an RNA genome into DNA that integrates into the host cell genome during the intracellular virus life cycle. Exogenous retroviruses (XRVs) are horizontally transmitted between host individuals, with disease outcome depending on interactions between the retrovirus and the host organism. When retroviruses infect germ line cells of the host, they may become endogenous retroviruses (ERVs), which are permanent elements in the host germ line that are subject to vertical transmission. These ERVs sometimes remain infectious and can themselves give rise to XRVs. This review integrates recent developments in…
Skeletal muscle is ideal for passive vaccine administration as it is easily accessible by intramuscular injection. Recombinant adeno-associated virus (rAAV) vectors are under consideration for passive vaccination clinical trials for HIV and influenza. However, therapeutic efficacy requires greater transduction of human skeletal muscle than is possible with existing serotypes. To bioengineer capsids with therapeutic levels of transduction, we utilized a directed evolution approach to screen libraries of shuffled AAV capsids in pools of surgically resected human skeletal muscle cells from five patients. Six rounds of evolution were performed in various muscle cell types, and evolved variants were validated against…
The lungs of cystic fibrosis (CF) patients are often colonized and/or infected by Staphylococcus aureus for years, mostly by one predominant clone. For long-term survival in this environment, S. aureus needs to adapt during its interactions with host factors, antibiotics, and other pathogens. Here, we study long-term transcriptional as well as genomic adaptations of an isogenic pair of S. aureus isolates from a single patient using RNA sequencing (RNA-Seq) and whole genome sequencing (WGS). Mimicking in vivo conditions, we cultivated the S. aureus isolates using artificial sputum medium before harvesting RNA for subsequent analysis. We confirmed our RNA-Seq data using…
Endogenous retroviruses (ERVs) occupy extensive regions of the human genome. Although many of these retroviral elements have lost their ability to replicate, those whose insertion took place more recently, such as the HML-2 group of HERV-K elements, still retain intact open reading frames and the capacity to produce certain viral RNA and/or proteins. Transcription of these ERVs is, however, tightly regulated by dedicated epigenetic control mechanisms. Nonetheless, it has been reported that some pathologic states, such as viral infections and certain cancers, coincide with ERV expression, suggesting that transcriptional reawakening is possible. HML-2 elements are reportedly induced during HIV-1 infection, but…
CDC-like kinase phosphorylation of serine/arginine-rich proteins is central to RNA splicing reactions. Yet, the genomic network of CDC-like kinase-dependent RNA processing events remains poorly defined. Here, we explore the connectivity of genomic CDC-like kinase splicing functions by applying graduated, short-exposure, pharmacological CDC-like kinase inhibition using a novel small molecule (T3) with very high potency, selectivity, and cell-based stability. Using RNA-Seq, we define CDC-like kinase-responsive alternative splicing events, the large majority of which monotonically increase or decrease with increasing CDC-like kinase inhibition. We show that distinct RNA-binding motifs are associated with T3 response in skipped exons. Unexpectedly, we observe dose-dependent conjoined…
MHC-E is a highly conserved nonclassical MHC class Ib molecule that predominantly binds and presents MHC class Ia leader sequence-derived peptides for NK cell regulation. However, MHC-E also binds pathogen-derived peptide Ags for presentation to CD8+ T cells. Given this role in adaptive immunity and its highly monomorphic nature in the human population, HLA-E is an attractive target for novel vaccine and immunotherapeutic modalities. Development of HLA-E-targeted therapies will require a physiologically relevant animal model that recapitulates HLA-E-restricted T cell biology. In this study, we investigated MHC-E immunobiology in two common nonhuman primate species, Indian-origin rhesus macaques (RM) and Mauritian-origin…
Structural variation and single-nucleotide variation of the complement factor H (CFH) gene family underlie several complex genetic diseases, including age-related macular degeneration (AMD) and atypical hemolytic uremic syndrome (AHUS). To understand its diversity and evolution, we performed high-quality sequencing of this ~360-kbp locus in six primate lineages, including multiple human haplotypes. Comparative sequence analyses reveal two distinct periods of gene duplication leading to the emergence of four CFH-related (CFHR) gene paralogs (CFHR2 and CFHR4 ~25-35 Mya and CFHR1 and CFHR3 ~7-13 Mya). Remarkably, all evolutionary breakpoints share a common ~4.8-kbp segment corresponding to an ancestral CFHR gene promoter that has…
Short-read massively parallel sequencing has emerged as a standard diagnostic tool in the medical setting. However, short-read technologies have inherent limitations such as GC bias, difficulties mapping to repetitive elements, trouble discriminating paralogous sequences, and difficulties in phasing alleles. Long-read single-molecule sequencers resolve these obstacles. Moreover, they offer higher consensus accuracies and can detect epigenetic modifications from native DNA. The first commercially available long-read single-molecule platform was the RS system based on PacBio’s single molecule real-time (SMRT) sequencing technology, which has since evolved into their RSII and Sequel systems. Here we capsulize how SMRT…
Long INterspersed Element-1 (LINE-1 or L1) is the only autonomously active transposable element in the human genome. L1 sequences comprise approximately 17% of the human genome, but only the evolutionarily recent, human-specific subfamily is retrotransposition competent. The L1 promoter has a bidirectional orientation containing a sense promoter that drives the transcription of two proteins required for retrotransposition and an antisense promoter. The L1 antisense promoter can drive transcription of chimeric transcripts: 5′ L1 antisense sequences spliced to the exons of neighboring genes. The impact of L1 antisense promoter activity on cellular transcriptomes is poorly understood. To investigate this, we analyzed GenBank…
Despite the importance of duplicate genes for evolutionary adaptation, accurate gene annotation is often incomplete, incorrect, or lacking in regions of segmental duplication. We developed an approach combining long-read sequencing and hybridization capture to yield full-length transcript information and confidently distinguish between nearly identical genes/paralogs. We used biotinylated probes to enrich for full-length cDNA from duplicated regions, which were then amplified, size-fractionated, and sequenced using single-molecule, long-read sequencing technology, permitting us to distinguish between highly identical genes by virtue of multiple paralogous sequence variants. We examined 19 gene families as expressed in developing and adult human brain, selected for their…
L1 elements represent the only currently active, autonomous retrotransposon in the human genome, and they make major contributions to human genetic instability. The vast majority of the 500,000 L1 elements in the genome are defective, and only relatively few can contribute to the retrotransposition process. However, there is currently no comprehensive approach to identify the specific loci that are actively transcribed, as distinct from the excess of L1-related sequences that are co-transcribed within genes. We have developed RNA-Seq procedures, as well as a 1200 bp 5′ RACE product coupled with PacBio sequencing that can identify the specific L1 loci…
Short-read sequencing has enabled the de novo assembly of several individual human genomes, but with inherent limitations in characterizing repeat elements. Here we sequence a Chinese individual HX1 by single-molecule real-time (SMRT) long-read sequencing, construct a physical map by NanoChannel arrays and generate a de novo assembly of 2.93 Gb (contig N50: 8.3 Mb, scaffold N50: 22.0 Mb, including 39.3 Mb N-bases), together with 206 Mb of alternative haplotypes. The assembly fully or partially fills 274 (28.4%) N-gaps in the reference genome GRCh38. Comparison to GRCh38 reveals 12.8 Mb of HX1-specific sequences, including 4.1 Mb that are not present in previously reported Asian genomes. Furthermore, long-read sequencing…
While some human-specific protein-coding genes have been proposed to originate from ancestral lncRNAs, the transition process remains poorly understood. Here we identified 64 hominoid-specific de novo genes and report a mechanism for the origination of functional de novo proteins from ancestral lncRNAs with precise splicing structures and specific tissue expression profiles. Whole-genome sequencing of dozens of rhesus macaques revealed that these lncRNAs are generally not more selectively constrained than other lncRNA loci. The existence of these newly originated de novo proteins is also not unexpected under neutral expectation, as they generally have a longer theoretical lifespan than their current age,…
Gliadins, specified by six compound chromosomal loci (Gli-A1/B1/D1 and Gli-A2/B2/D2) in hexaploid bread wheat, are the dominant carriers of celiac disease (CD) epitopes. Because of their complexity, genome-wide characterization of gliadins is a major challenge. Here, we approached this challenge by combining transcriptomic, proteomic and bioinformatic investigations. Through third-generation RNA sequencing, full-length transcripts were identified for 52 gliadin genes in the bread wheat cultivar Xiaoyan 81. Of them, 42 were active and predicted to encode 25 α-, 11 γ-, one δ- and five ω-gliadins. Comparative proteomic analysis between Xiaoyan 81 and six newly developed mutants each lacking one Gli locus indicated…