July 7, 2019

Map-based cloning of the fertility restoration locus Rfm1 in cultivated barley (Hordeum vulgare)

Hybridization technology has proven valuable in enhancing yields in many crops, but was only recently adopted in the small grain cereals. Hybrid varieties in barley (Hordeum vulgare) rely on the cytoplasmic male sterility (CMS) system msm1 derived from Hordeum vulgare ssp. spontaneum. The major restorer gene described for the msm1 system is known as Rfm1 and maps to the top of chromosome 6H. To gain further insight into mechanisms underlying male fertility restoration in barley, we used a map-based cloning approach to identify the nuclear gene involved in the restoration mechanism of this hybridization system. Taking advantage of the available genomic resources in barley in combination with a custom-made non-gridded BAC library developed from a restorer line, we cloned and sequenced the Rfm1 restorer locus. The characterization and annotation of the nucleotide sequence for the Rfm1 restorer allele allowed for the identification of the candidate gene for Rfm1. The Rfm1 locus carries a tandem repeat of a gene encoding a pentatricopeptide repeat (PPR) protein. Surprisingly, Rfm1 belongs to the PLS-DYW subfamily of PPR genes, which are known for their involvement in RNA editing in plant organelles but have not, to date, been identified as restorer genes.


July 7, 2019

Genome-wide epigenetic studies in chicken: A review

Over the years, farmed birds have been selected on various performance traits mainly through genetic selection. However, many studies have shown that genetics may not be the sole contributor to phenotypic plasticity. Gene expression programs can be influenced by environmentally induced epigenetic changes that may alter the phenotypes of the developing animals. Recently, high-throughput sequencing techniques became sufficiently affordable thanks to technological advances to study whole epigenetic landscapes in model plants and animals. In birds, a growing number of studies recently took advantage of these techniques to gain insights into the epigenetic mechanisms of gene regulation in processes such as immunity or environmental adaptation. Here, we review the current gain of knowledge on the chicken epigenome made possible by recent advances in high-throughput sequencing techniques by focusing on the two most studied epigenetic modifications, DNA methylation and histone post-translational modifications. We discuss and provide insights about designing and performing analyses to further explore avian epigenomes. A better understanding of the molecular mechanisms underlying the epigenetic regulation of gene expression in relation to bird phenotypes may provide new knowledge and markers that should undoubtedly contribute to a sustainable poultry production.


July 7, 2019

Genome sequencing brought Gossypium biology research into a new era.

The first sequenced diploid cotton genome was published in 2012 by the group led by the Institute of Cotton Research, Chinese Academy of Agricultural Sciences. Cotton genomics research subsequently entered a period of rapid development. The accumulating data have provided new insights into the evolution and domestication of cotton, the development of important agronomic traits, and strategies for improving cotton quality and production.


July 7, 2019

Ultraaccurate genome sequencing and haplotyping of single human cells.

Accurate detection of variants and long-range haplotypes in genomes of single human cells remains very challenging. Common approaches require extensive in vitro amplification of genomes of individual cells using DNA polymerases and high-throughput short-read DNA sequencing. These approaches have two notable drawbacks. First, polymerase replication errors could generate tens of thousands of false-positive calls per genome. Second, relatively short sequence reads contain little to no haplotype information. Here we report a method, dubbed SISSOR (single-stranded sequencing using microfluidic reactors), for accurate single-cell genome sequencing and haplotyping. A microfluidic processor is used to separate the Watson and Crick strands of the double-stranded chromosomal DNA in a single cell and to randomly partition megabase-size DNA strands into multiple nanoliter compartments for amplification and construction of barcoded libraries for sequencing. The separation and partitioning of large single-stranded DNA fragments of the homologous chromosome pairs allows for the independent sequencing of each of the complementary and homologous strands. This enables the assembly of long haplotypes and reduction of sequence errors by using the redundant sequence information and haplotype-based error removal. We demonstrated the ability to sequence single-cell genomes with error rates as low as 10^-8 and average 500-kb-long DNA fragments that can be assembled into haplotype contigs with N50 greater than 7 Mb. The performance could be further improved with more uniform amplification and more accurate sequence alignment. The ability to obtain accurate genome sequences and haplotype information from single cells will enable applications of genome sequencing for diverse clinical needs. Copyright © 2017 the Author(s). Published by PNAS.
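For readers unfamiliar with the N50 statistic quoted in the abstract above, the following is a minimal sketch of how it is commonly computed: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the summed length. The function name `n50` and the example lengths are illustrative assumptions, not data or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating covered length.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical haplotype contig lengths in megabases (not from the paper).
example_contigs_mb = [12, 9, 7, 4, 2, 1]
print(n50(example_contigs_mb))
```

Because the statistic is weighted by length, a handful of long contigs can dominate it, which is why an N50 above 7 Mb indicates that most of the phased sequence sits in multi-megabase haplotype blocks.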


July 7, 2019

Trajectories and drivers of genome evolution in surface-associated marine Phaeobacter.

The extent of genome divergence and the evolutionary events leading to speciation of marine bacteria have mostly been studied for (locally) abundant, free-living groups. The genus Phaeobacter is found on different marine surfaces, seems to occupy geographically disjunct habitats, and is involved in different biotic interactions, and was therefore targeted in the present study. The analysis of the chromosomes of 32 closely related but geographically spread Phaeobacter strains revealed an exceptionally large, highly syntenic core genome. The flexible gene pool is constantly but slightly expanding across all Phaeobacter lineages. The horizontally transferred genes mostly originated from bacteria of the Roseobacter group and horizontal transfer most likely was mediated by gene transfer agents. No evidence for geographic isolation and habitat specificity of the different phylogenomic Phaeobacter clades was detected based on the sources of isolation. In contrast, the functional gene repertoire and physiological traits of different phylogenomic Phaeobacter clades were sufficiently distinct to suggest an adaptation to an associated lifestyle with algae, to additional nutrient sources, or toxic heavy metals. Our study reveals that the evolutionary trajectories of surface-associated marine bacteria can differ significantly from free-living marine bacteria or marine generalists. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


July 7, 2019

Disease onset in X-linked dystonia-parkinsonism correlates with expansion of a hexameric repeat within an SVA retrotransposon in TAF1.

X-linked dystonia-parkinsonism (XDP) is a neurodegenerative disease associated with an antisense insertion of a SINE-VNTR-Alu (SVA)-type retrotransposon within an intron of TAF1. This unique insertion coincides with six additional noncoding sequence changes in TAF1, the gene that encodes TATA-binding protein-associated factor-1, which appear to be inherited together as an identical haplotype in all reported cases. Here we examined the sequence of this SVA in XDP patients (n = 140) and detected polymorphic variation in the length of a hexanucleotide repeat domain, (CCCTCT)n. The number of repeats in these cases ranged from 35 to 52 and showed a highly significant inverse correlation with age at disease onset. Because other SVAs exhibit intrinsic promoter activity that depends in part on the hexameric domain, we assayed the transcriptional regulatory effects of varying hexameric lengths found in the unique XDP SVA retrotransposon using luciferase reporter constructs. When inserted sense or antisense to the luciferase reading frame, the XDP variants repressed or enhanced transcription, respectively, to an extent that appeared to vary with length of the hexamer. Further in silico analysis of this SVA sequence revealed multiple motifs predicted to form G-quadruplexes, with the greatest potential detected for the hexameric repeat domain. These data directly link sequence variation within the XDP-specific SVA sequence to phenotypic variability in clinical disease manifestation and provide insight into potential mechanisms by which this intronic retroelement may induce transcriptional interference in TAF1 expression. Copyright © 2017 the Author(s). Published by PNAS.
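The two sequence features mentioned in the abstract above, the (CCCTCT)n repeat length and the predicted G-quadruplex motifs, can be illustrated with a short sketch. It assumes the widely used textbook pattern for a putative quadruplex (four runs of at least three G separated by loops of 1 to 7 bases), which is not necessarily the prediction method the authors used; the helper names and the synthetic test sequence are hypothetical.

```python
import re

# Common heuristic pattern for a putative G-quadruplex: four G-runs (>= 3 G)
# separated by loops of 1-7 nucleotides. Assumed here for illustration only.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_tandem_repeat(seq, unit="CCCTCT"):
    """Return the largest number of back-to-back copies of `unit` in `seq`."""
    best = 0
    for match in re.finditer(f"(?:{unit})+", seq):
        best = max(best, len(match.group()) // len(unit))
    return best


def has_putative_g4(seq):
    """True if the sequence contains at least one match to G4_PATTERN."""
    return G4_PATTERN.search(seq) is not None


# Synthetic sequence with 35 hexamer copies (the low end of the reported range).
synthetic = "AA" + "CCCTCT" * 35 + "TT"
print(longest_tandem_repeat(synthetic))
print(has_putative_g4(synthetic))
print(has_putative_g4(reverse_complement(synthetic)))
```

The C-rich strand contains no guanines, so the pattern only matches on its reverse complement, (AGAGGG)n, where consecutive GGG runs are separated by three-base loops.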


July 7, 2019

On the importance of homology in the age of phylogenomics

Homology is perhaps the most central concept of phylogenetic biology. Molecular systematists have traditionally paid due attention to the homology statements that are implied by their alignments of orthologous sequences, but some authors have suggested that manual gene-by-gene curation is not sustainable in the phylogenomics era. Here, we show that there are multiple ways to efficiently screen for and detect homology errors in phylogenomic data sets. Application of these screening approaches to two phylogenomic data sets, one for birds and another for mammals, shows that these data are replete with homology errors including alignments of different exons to each other, alignments of exons to introns, and alignments of paralogues to each other. The extent of these homology errors weakens the conclusions of studies based on these data sets. Despite advances in automated phylogenomic pipelines, we contend that much of the long, difficult, and sometimes tedious work of systematics is still required to guard against pervasive homology errors. This conclusion is underscored by recent studies that show that just a few outlier genes can impact phylogenetic results at short, tightly spaced internodes that are deep in the Tree of Life. The view that widespread DNA sequence alignment errors are not a major concern for rigorous systematic research is not tenable. If a primary goal of phylogenomics is to resolve the most challenging phylogenetic problems with the abundant data that are now available, researchers must employ effective procedures to screen for and correct homology errors prior to performing downstream phylogenetic analyses.


July 7, 2019

Comparative whole-genomic analysis of an ancient L2 lineage Mycobacterium tuberculosis reveals a novel phylogenetic clade and common genetic determinants of hypervirulent strains.

Background: Development of improved therapeutics against tuberculosis (TB) is hindered by an inadequate understanding of the relationship between disease severity and genetic diversity of its causative agent, Mycobacterium tuberculosis. We previously isolated a hypervirulent M. tuberculosis strain H112 from an HIV-negative patient with an aggressive disease progression from pulmonary TB to tuberculous meningitis, the most severe manifestation of tuberculosis. A human macrophage challenge experiment demonstrated that strain H112 exhibited significantly better intracellular survivability and induced a lower level of TNF-α than the reference virulent strain H37Rv and 123 other clinical isolates. Aim: The present study aimed to identify the potential genetic determinants of mycobacterial virulence that were common to strain H112 and hypervirulent M. tuberculosis strains of the same phylogenetic clade isolated in other global regions. Methods: A low-virulent M. tuberculosis strain H54, which belonged to the same phylogenetic lineage (L2) as strain H112, was selected from a collection of 115 clinical isolates. Both H112 and H54 were whole-genome-sequenced using PacBio sequencing technology. A comparative genomics approach was adopted to identify mutations present in strain H112 but absent in strain H54. Subsequently, an extensive phylogenetic analysis was conducted by including all publicly available M. tuberculosis genomes. Single-nucleotide polymorphisms (SNPs) and structural variations (SVs) common to hypervirulent strains in the global collection of genomes were considered as potential genetic determinants of hypervirulence. Results: Sequencing data revealed that both H112 and H54 were members of the same sub-lineage L2.2.1. After excluding the lineage-related mutations shared between H112 and H54, we analyzed the phylogenetic relatedness of H112 with a global collection of M. tuberculosis genomes (n = 4,338), and identified a novel phylogenetic clade in which four hypervirulent strains isolated from geographically diverse regions were clustered together. All hypervirulent strains in the clade shared 12 SNPs and 5 SVs with H112, including those affecting key virulence-associated loci, notably, a deleterious SNP (rv0178 p.D150E) within the mce1 operon and an intergenic deletion (854259_854261delCC) in close proximity to phoP. Conclusion: The present study identified common genetic factors in a novel phylogenetic clade of hypervirulent M. tuberculosis. The causative role of these mutations in mycobacterial virulence should be validated in future studies.


July 7, 2019

A recurrence-based approach for validating structural variation using long-read sequencing technology.

Although numerous algorithms have been developed to identify structural variations (SVs) in genomic sequences, there is a dearth of approaches that can be used to evaluate their results. This is significant as the accurate identification of structural variation is still an outstanding but important problem in genomics. The emergence of new sequencing technologies that generate longer sequence reads can, in theory, provide direct evidence for all types of SVs regardless of the length of the region they span. However, current efforts to use these data in this manner require the use of large computational resources to assemble these sequences as well as visual inspection of each region. Here we present VaPoR, a highly efficient algorithm that autonomously validates large SV sets using long-read sequencing data. We assessed the performance of VaPoR on SVs in both simulated and real genomes and report a high-fidelity rate for overall accuracy across different levels of sequence depths. We show that VaPoR can interrogate a much larger range of SVs while still matching existing methods in terms of false positive validations and providing additional features considering breakpoint precision and predicted genotype. We further show that VaPoR can run quickly and efficiently without requiring a large processing or assembly pipeline. VaPoR provides a long read-based validation approach for genomic SVs that requires relatively low read depth and computing resources and thus will provide utility with targeted or low-pass sequencing coverage for accurate SV assessment. The VaPoR Software is available at: https://github.com/mills-lab/vapor. © The Authors 2017. Published by Oxford University Press.


July 7, 2019

De novo design and synthesis of a 30-cistron translation-factor module.

Two of the many goals of synthetic biology are synthesizing large biochemical systems and simplifying their assembly. While several genes have been assembled together by modular idempotent cloning, it is unclear if such simplified strategies scale to very large constructs for expression and purification of whole pathways. Here we synthesize from oligodeoxyribonucleotides a completely de-novo-designed, 58-kb multigene DNA. This BioBrick plasmid insert encodes 30 of the 31 translation factors of the PURE translation system, each His-tagged and in separate transcription cistrons. Dividing the insert between three high-copy expression plasmids enables the bulk purification of the aminoacyl-tRNA synthetases and translation factors necessary for affordable, scalable reconstitution of an in vitro transcription and translation system, PURE 3.0. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.


July 7, 2019

Unlocking the biological potential of Euglena gracilis: evolution, cell biology and significance to parasitism

Photosynthetic euglenids are major components of aquatic ecosystems and relatives of trypanosomes. Euglena gracilis has considerable biotechnological potential and great adaptability, but exploitation remains hampered by the absence of a comprehensive gene catalogue. We address this by genome, RNA and protein sequencing: the E. gracilis genome is >2 Gb, with 36,526 predicted proteins. Large lineage-specific paralog families are present, with evidence for flexibility in environmental monitoring, divergent mechanisms for metabolic control, and novel solutions for adaptation to extreme environments. Contributions from photosynthetic eukaryotes to the nuclear genome, consistent with the shopping bag model, are found, together with transitions between kinetoplastid and canonical systems. Control of protein expression is almost exclusively post-transcriptional. These data are a major advance in understanding the nuclear genomes of euglenids and provide a platform for investigating the contributions of E. gracilis and its relatives to the biosphere.


July 7, 2019

The plastid genome in Cladophorales green algae is encoded by hairpin chromosomes.

Virtually all plastid (chloroplast) genomes are circular double-stranded DNA molecules, typically between 100 and 200 kb in size and encoding circa 80-250 genes. Exceptions to this universal plastid genome architecture are very few and include the dinoflagellates, where genes are located on DNA minicircles. Here we report on the highly deviant chloroplast genome of Cladophorales green algae, which is entirely fragmented into hairpin chromosomes. Short- and long-read high-throughput sequencing of DNA and RNA demonstrated that the chloroplast genes of Boodlea composita are encoded on 1- to 7-kb DNA contigs with an exceptionally high GC content, each containing a long inverted repeat with one or two protein-coding genes and conserved non-coding regions putatively involved in replication and/or expression. We propose that these contigs correspond to linear single-stranded DNA molecules that fold onto themselves to form hairpin chromosomes. The Boodlea chloroplast genes are highly divergent from their corresponding orthologs, and display an alternative genetic code. The origin of this highly deviant chloroplast genome most likely occurred before the emergence of the Cladophorales, and coincided with an elevated transfer of chloroplast genes to the nucleus. A chloroplast genome that is composed only of linear DNA molecules is unprecedented among eukaryotes, and highlights unexpected variation in plastid genome architecture. Copyright © 2017 Elsevier Ltd. All rights reserved.


July 7, 2019

Mechanisms of adaptive divergence and speciation in Littorina saxatilis: Integrating knowledge from ecology and genetics with new data emerging from genomic studies

New opportunities to understand marine speciation and evolution of local adaptation come with genomic approaches and with the development of comprehensive model systems. The marine snail Littorina saxatilis is one example of a developing marine model for investigating genetic mechanisms of rapid divergence and evolution in natural systems. This species is strongly polymorphic and shows formation of local ecotypes throughout its distribution. Support is strong for primary (in situ) and parallel formation of reproductively semi-isolated ecotypes with contact zones between heterogeneous intertidal microhabitats. This makes this species an ideal organism for gaining new insights into the interplay of divergent selection, gene flow and genetic drift during local adaptation and speciation. A relatively well-resolved draft genome and a genetic map describing 17 linkage groups (“chromosomes”) are key tools for investigating the role of structural genomic variation, such as inversions, gene duplications and translocations. Whole genome re-sequencing of pools of individuals and the first comprehensive study of a contact zone contribute direct information on selection and barriers to gene flow present in specific regions of the genome. Linking selection at the phenotypic level to patterns observed in the genome is under way by quantitative trait loci mapping and annotation of candidate genes, while the role of single mutations on individual fitness will have to await development of gene manipulation tools. The features of the snail system facilitate the study of local adaptation and speciation and its genomic basis, but the underlying evolutionary processes are expected to be similar in other organisms, and hence this species is a useful model.


July 7, 2019

Genome sequence-based marker development and genotyping in potato

Potato (Solanum tuberosum L.) is one of the world’s most economically important food crops and holds major significance for future food security. Despite its importance, the study of potato genetics and breeding has lagged behind, mainly due to its polyploid genome and high levels of heterozygosity. Conventional marker and genotyping approaches have been helpful in progressing potato genetic research but have also had limitations in exploiting the outcome from these studies for gene discovery and applied research applications. The sequencing of the potato genome, followed by advancements in marker and genotyping technologies, has brought a step change in the way potato genetic studies are conducted. Potato is now amenable to modern sequence-based marker and genotyping methods with their increased ability to put thousands of markers on any population of interest without a priori knowledge. This has brought genetic studies a precision and resolution previously not feasible in potato. A diverse range of fixed and flexible genotyping platforms, for a wide variety of research and breeding applications, is now available. Concerted research efforts are now needed to screen the available genetic diversity for this important crop to identify novel and beneficial trait alleles, enabling efficient and precise introgression breeding of climate-smart and resilient potato cultivars. This chapter provides an overview of sequence-based marker development and genotyping methods along with their implications for potato research and breeding in the post-genomics era.


July 7, 2019

The state of whole-genome sequencing

Over the last decade, a technological paradigm shift has slashed the cost of DNA sequencing by over five orders of magnitude. Today, the cost of sequencing a human genome is a few thousand dollars, and it continues to fall. Here, we review the most cost-effective platforms for whole-genome sequencing (WGS) as well as emerging technologies that may displace or complement these. We also discuss the practical challenges of generating and analyzing WGS data, and how WGS has unlocked new strategies for discovering genes and variants underlying both rare and common human diseases.

