2015 SMRT Informatics Developers Conference Presentation Slides: Shinichi Morishita of the University of Tokyo presented on how his team has been using SMRT Sequencing to better understand methylomes, metagenomes and structural variation of various eukaryotic genomes.
Genome-Wide Association Study of Growth and Body-Shape-Related Traits in Large Yellow Croaker (Larimichthys crocea) Using ddRAD Sequencing.
Large yellow croaker (Larimichthys crocea) is an economically important marine fish species of China. Due to overfishing and marine pollution, the wild stocks of this croaker have collapsed in the past decades. Meanwhile, the cultured croaker faces reduced genetic diversity and a low growth rate. To explore the molecular markers related to the growth traits of croaker and to provide the related SNPs for marker-assisted selection, we used double-digest restriction-site associated DNA (ddRAD) sequencing to dissect the genetic bases of growth traits in a cultured population and to identify the SNPs associated with important growth traits by GWAS. A total of 220 individuals were genotyped by ddRAD sequencing. After quality control, 27,227 SNPs were identified in 220 samples and used for GWAS analysis. We identified 13 genome-wide significantly associated SNPs for growth traits on 8 chromosomes, and the beta P of these SNPs ranged from 0.01 to 0.86. Through the definition of candidate regions and gene annotation, candidate genes related to growth were identified, including important regulators such as fgf18, fgf1, nr3c1, cyp8b1, fabp2, cyp2r1, ppara, and ccm2l. We also identified SNPs and candidate genes significantly associated with body shape, which is also an economically important trait for large yellow croaker aquaculture, including bmp7, col1a1, col11a2, and col18a1. The results provide insights into the genetic basis of growth and body shape in the large yellow croaker population and would provide reliable genetic markers for molecular marker-assisted selection in the future. Meanwhile, the results establish a basis for our subsequent fine mapping and related gene studies.
Forest tree species are increasingly subject to severe mortalities from exotic pests, diseases, and invasive organisms, accelerated by climate change. Forest health issues are threatening multiple species and ecosystem sustainability globally. While sources of resistance may be available in related species, or among surviving trees, introgression of resistance genes into threatened tree species in reasonable time frames requires genome-wide breeding tools. Asian species of chestnut (Castanea spp.) are being employed as donors of disease resistance genes to restore native chestnut species in North America and Europe. To aid in the restoration of threatened chestnut species, we present the assembly of a reference genome with chromosome-scale sequences for Chinese chestnut (C. mollissima), the disease-resistance donor for American chestnut restoration. We also demonstrate the value of the genome as a platform for research and species restoration, including new insights into the evolution of blight resistance in Asian chestnut species, the locations in the genome of ecologically important signatures of selection differentiating American chestnut from Chinese chestnut, the identification of candidate genes for disease resistance, and preliminary comparisons of genome organization with related species.
Morphological and genomic characterisation of the hybrid schistosome infecting humans in Europe reveals a complex admixture between Schistosoma haematobium and Schistosoma bovis parasites
Schistosomes cause schistosomiasis, the world's second most important parasitic disease after malaria. A peculiar feature of schistosomes is their ability to produce viable and fertile hybrids. Originally only present in the tropics, schistosomiasis is now also endemic in Europe. Based on two genetic markers, the European species had been identified as a hybrid between the ruminant-infective Schistosoma bovis and the human-infective Schistosoma haematobium. Here we describe for the first time the genomic composition of the European schistosome hybrid (77% of S. haematobium and 23% of S. bovis origins), its morphometric parameters and its compatibility with the European vector snail and intermediate host. Compatibility is a key parameter for the parasite's life cycle progression. We also show that egg morphology (a classical diagnostic parameter) does not allow for differential diagnosis while genetic tests do so. Additionally, we performed genome assembly improvement and annotation of S. bovis, the parental species for which no satisfactory genome assembly was available. For the first time since the discovery of hybrid schistosomes, these results reveal at the whole genomic level a complex admixture of parental genomes highlighting (i) the high permeability of schistosomes to other species' alleles, and (ii) the importance of hybrid formation for pushing species boundaries not only conceptually but also geographically.
Optimized Cas9 expression systems for highly efficient Arabidopsis genome editing facilitate isolation of complex alleles in a single generation.
Genetic resources for the model plant Arabidopsis comprise mutant lines defective in almost any single gene in reference accession Columbia. However, gene redundancy and/or close linkage often render it extremely laborious or even impossible to isolate a desired line lacking a specific function or set of genes from segregating populations. Therefore, we here evaluated strategies and efficiencies for the inactivation of multiple genes by Cas9-based nucleases and multiplexing. In first attempts, we succeeded in isolating a mutant line carrying a 70 kb deletion, which occurred at a frequency of ~1.6% in the T2 generation, through PCR-based screening of numerous individuals. However, we failed to isolate a line lacking Lhcb1 genes, which are present in five copies organized at two loci in the Arabidopsis genome. To improve efficiency of our Cas9-based nuclease system, regulatory sequences controlling Cas9 expression levels and timing were systematically compared. Indeed, use of DD45 and RPS5a promoters improved efficiency of our genome editing system by approximately 25-30-fold in comparison to the previous ubiquitin promoter. Using an optimized genome editing system with RPS5a promoter-driven Cas9, putatively quintuple mutant lines lacking detectable amounts of Lhcb1 protein represented approximately 30% of T1 transformants. These results show how improved genome editing systems facilitate the isolation of complex mutant alleles, previously considered impossible to generate, at high frequency even in a single (T1) generation.
Genome sequence analysis of 91 Salmonella Enteritidis isolates from mice caught on poultry farms in the mid 1990s.
A total of 91 draft genome sequences were used to analyze isolates of Salmonella enterica serovar Enteritidis obtained from feral mice caught on poultry farms in Pennsylvania. One objective was to find mutations disrupting open reading frames (ORFs) and another was to determine if ORF-disruptive mutations were present in isolates obtained from other sources. A total of 83 mice were obtained between 1995 and 1998. Isolates separated into two genomic clades and 12 subgroups due to 742 mutations. Nineteen ORF-disruptive mutations were found, and in addition, bigA had exceptional heterogeneity requiring additional evaluation. The TRAMS algorithm detected only 6 ORF disruptions. The sefD mutation was the most frequently encountered mutation and it was prevalent in human, poultry, environmental and mouse isolates. These results confirm previous assessments of the mouse as a rich source of Salmonella enterica serovar Enteritidis that varies in genotype and phenotype. Copyright © 2019. Published by Elsevier Inc.
A high-quality genome sequence of any model organism is an essential starting point for genetic and other studies. Older clone-based methods are slow and expensive, whereas faster, cheaper short-read-only assemblies can be incomplete and highly fragmented, which minimizes their usefulness. The last few years have seen the introduction of many new technologies for genome assembly. These new technologies and associated new algorithms are typically benchmarked on microbial genomes or, if they scale appropriately, on larger (e.g., human) genomes. However, plant genomes can be much more repetitive and larger than the human genome, and plant biochemistry often makes obtaining high-quality DNA that is free from contaminants difficult. Reflecting their challenging nature, we observe that plant genome assembly statistics are typically poorer than for vertebrates. Here, we compare Illumina short read, Pacific Biosciences long read, 10x Genomics linked reads, Dovetail Hi-C, and BioNano Genomics optical maps, singly and combined, in producing high-quality long-range genome assemblies of the potato species Solanum verrucosum. We benchmark the assemblies for completeness and accuracy, as well as DNA and compute requirements and sequencing costs. The field of genome sequencing and assembly is reaching maturity, and the differences we observe between assemblies are surprisingly small. We expect that our results will be helpful to other genome projects, and that these datasets will be used in benchmarking by assembly algorithm developers. © The Author(s) 2019. Published by Oxford University Press.
Finding the needle in a haystack: Mapping antifungal drug resistance in fungal pathogen by genomic approaches.
Fungi are ubiquitous on earth and are essential for the maintenance of the global ecological equilibrium. Despite providing benefits to living organisms, they can also target specific hosts and inflict damage. These fungal pathogens are known to affect, for example, plants and mammals and thus reduce crop production necessary to sustain food supply and cause mortality in humans and animals. Designing defenses against these fungi is essential for the control of food resources and human health. As far as fungal pathogens are concerned, the principal option has been the use of antifungal agents, also called fungicides when they are used in the environment.
A chromosome-level sequence assembly reveals the structure of the Arabidopsis thaliana Nd-1 genome and its gene set.
In addition to the BAC-based reference sequence of the accession Columbia-0 from the year 2000, several short read assemblies of THE plant model organism Arabidopsis thaliana were published during the last years. Also, a SMRT-based assembly of Landsberg erecta has been generated that identified translocation and inversion polymorphisms between two genotypes of the species. Here we provide a chromosome-arm level assembly of the A. thaliana accession Niederzenz-1 (AthNd-1_v2c) based on SMRT sequencing data. The best assembly comprises 69 nucleome sequences and displays a contig length of up to 16 Mbp. Compared to an earlier Illumina short read-based NGS assembly (AthNd-1_v1), a 75 fold increase in contiguity was observed for AthNd-1_v2c. To assign contig locations independent from the Col-0 gold standard reference sequence, we used genetic anchoring to generate a de novo assembly. In addition, we assembled the chondrome and plastome sequences. Detailed analyses of AthNd-1_v2c allowed reliable identification of large genomic rearrangements between A. thaliana accessions contributing to differences in the gene sets that distinguish the genotypes. One of the differences detected identified a gene that is lacking from the Col-0 gold standard sequence. This de novo assembly extends the known proportion of the A. thaliana pan-genome.
Improving traits in wheat has historically been challenging due to its large and polyploid genome, limited genetic diversity and in-field phenotyping constraints. However, within recent years many of these barriers have been lowered. The availability of a chromosome-level assembly of the wheat genome now facilitates a step-change in wheat genetics and provides a common platform for resources, including variation data, gene expression data and genetic markers. The development of sequenced mutant populations and gene-editing techniques now enables the rapid assessment of gene function in wheat directly. The ability to alter gene function in a targeted manner will unmask the effects of homoeolog redundancy and allow the hidden potential of this polyploid genome to be discovered. New techniques to identify and exploit the genetic diversity within wheat wild relatives now enable wheat breeders to take advantage of these additional sources of variation to address challenges facing food production. Finally, advances in phenomics have unlocked rapid screening of populations for many traits of interest both in greenhouses and in the field. Looking forwards, integrating diverse data types, including genomic, epigenetic and phenomics data, will take advantage of big data approaches including machine learning to understand trait biology in wheat in unprecedented detail. © 2018 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.
Endogenous pararetroviruses (EPRVs) have been characterized in several plant genomes and their biological effects have been reported. In this study, hundreds of EPRV segments were identified in six Citrinae genomes. A total of 1034 EPRV segments were identified in the genomes of sweet orange, 2036 in pummelo, 598 in clementine mandarin, 752 in Ichang papeda, 2060 in citron and 245 in atalantia. Genomic analysis indicated that EPRV segments tend to cluster as hot spots in the genomes, particularly on chromosomes 2 and 5. Large numbers of simple repeats and transposable elements were identified in the 2-kb flanking regions of the EPRV segments. Comparative genomic analysis and PCR experiments showed that there are highly conserved EPRV segments and species-specific EPRV segments among the Citrinae genomes. Phylogenetic analysis suggested that the integration events of EPRVs could have initiated in a common progenitor of Citrinae species and occurred repeatedly during the Citrinae divergence. Copyright © 2018 Elsevier B.V. All rights reserved.
Genetic map-guided genome assembly reveals a virulence-governing minichromosome in the lentil anthracnose pathogen Colletotrichum lentis.
Colletotrichum lentis causes anthracnose, which is a serious disease on lentil and can account for up to 70% crop loss. Two pathogenic races, 0 and 1, have been described in the C. lentis population from lentil. To unravel the genetic control of virulence, an isolate of the virulent race 0 was sequenced at 1481-fold genomic coverage. The 56.10-Mb genome assembly consists of 50 scaffolds with N50 scaffold length of 4.89 Mb. A total of 11 436 protein-coding gene models was predicted in the genome with 237 coding candidate effectors, 43 secondary metabolite biosynthetic enzymes and 229 carbohydrate-active enzymes (CAZymes), suggesting a contraction of the virulence gene repertoire in C. lentis. Scaffolds were assigned to 10 core and two minichromosomes using a population (race 0 × race 1, n = 94 progeny isolates) sequencing-based, high-density (14 312 single nucleotide polymorphisms) genetic map. Composite interval mapping revealed a single quantitative trait locus (QTL), qClVIR-11, located on minichromosome 11, explaining 85% of the variability in virulence of the C. lentis population. The QTL covers a physical distance of 0.84 Mb with 98 genes, including seven candidate effector and two secondary metabolite genes. Taken together, the study provides genetic and physical evidence for the existence of a minichromosome controlling the C. lentis virulence on lentil. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.
Morella rubra, red bayberry, is an economically important fruit tree in south China. Here, we assembled the first high-quality genome for both a female and a male individual of red bayberry. The genome size was 313 Mb, and 90% of the sequences were assembled into eight pseudochromosome molecules, with 32 493 predicted genes. By whole-genome comparison between the female and male and association analysis with sequences of bulked and individual DNA samples from female and male, a 59-Kb female-determining region was identified and located on the distal end of pseudochromosome 8. This region contains abundant transposable elements and seven putative genes, four of which are related to sex floral development. This 59-Kb female-specific region was likely to be derived from duplication and rearrangement of paralogous genes and retained non-recombinant in the female-specific region. Sex-specific molecular markers developed from candidate genes co-segregated with sex in a genetically diverse female and male germplasm. We propose that sex determination follows the ZW model of female heterogamety. The genome sequence of red bayberry provides a valuable resource for plant sex chromosome evolution and also provides important insights for molecular biology, genetics and modern breeding in the Myricaceae family. © 2018 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
Mutation and recombination are key evolutionary processes governing phenotypic variation and reproductive isolation. We here demonstrate that biodiversity within all globally known strains of Schizosaccharomyces pombe arose through admixture between two divergent ancestral lineages. Initial hybridization was inferred to have occurred ~20-60 sexual outcrossing generations ago consistent with recent, human-induced migration at the onset of intensified transcontinental trade. Species-wide heritable phenotypic variation was explained near-exclusively by strain-specific arrangements of alternating ancestry components with evidence for transgressive segregation. Reproductive compatibility between strains was likewise predicted by the degree of shared ancestry. To assess the genetic determinants of ancestry block distribution across the genome, we characterized the type, frequency, and position of structural genomic variation using nanopore and single-molecule real-time sequencing. Despite being associated with double-strand break initiation points, over 800 segregating structural variants exerted overall little influence on the introgression landscape or on reproductive compatibility between strains. In contrast, we found strong ancestry disequilibrium consistent with negative epistatic selection shaping genomic ancestry combinations during the course of hybridization. This study provides a detailed, experimentally tractable example that genomes of natural populations are mosaics reflecting different evolutionary histories. Exploiting genome-wide heterogeneity in the history of ancestral recombination and lineage-specific mutations sheds new light on the population history of S. pombe and highlights the importance of hybridization as a creative force in generating biodiversity. © The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
The human disease lymphatic filariasis causes the debilitating effects of elephantiasis and hydrocele. Lymphatic filariasis currently affects the lives of 90 million people in 52 countries. There are three nematodes that cause lymphatic filariasis, Brugia malayi, Brugia timori, and Wuchereria bancrofti, but 90% of all cases of lymphatic filariasis are caused solely by W. bancrofti (Wb). Here we use population genomics to reconstruct the probable route and timing of migration of Wb strains that currently infect Africa, Haiti, and Papua New Guinea (PNG). We used selective whole genome amplification to sequence 42 whole genomes of single Wb worms from populations in Haiti, Mali, Kenya, and PNG. Our results are consistent with a hypothesis of an Island Southeast Asia or East Asian origin of Wb. Our demographic models support divergence times that correlate with the migration of human populations. We hypothesize that PNG was infected at two separate times, first by the Melanesians and later by the migrating Austronesians. The migrating Austronesians also likely introduced Wb to Madagascar where later migrations spread it to continental Africa. From Africa, Wb spread to the New World during the transatlantic slave trade. Genome scans identified 17 genes that were highly differentiated among Wb populations. Among these are genes associated with human immune suppression, insecticide sensitivity, and proposed drug targets. Identifying the distribution of genetic diversity in Wb populations and selection forces acting on the genome will build a foundation to test future hypotheses and help predict response to current eradication efforts. © The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.