April 21, 2020  |  

Hi-C guided assemblies reveal conserved regulatory topologies on X and autosomes despite extensive genome shuffling

Genome rearrangements that occur during evolution impose major challenges on regulatory mechanisms that rely on three-dimensional genome architecture. Here, we developed a scaffolding algorithm and generated chromosome-length assemblies from Hi-C data for studying genome topology in three distantly related Drosophila species. We observe extensive genome shuffling between these species with one synteny breakpoint after approximately every six genes. A/B compartments, a set of large gene-dense topologically associating domains (TADs) and spatial contacts between high-affinity sites (HAS) located on the X chromosome are maintained over 40 million years, indicating architectural conservation at various hierarchies. Evolutionarily conserved genes cluster in the vicinity of HAS, while HAS locations appear evolutionarily flexible, thus uncoupling functional requirement of dosage compensation from individual positions on the linear X chromosome. Therefore, 3D architecture is preserved even in scenarios of thousands of rearrangements, highlighting its relevance for essential processes such as dosage compensation of the X chromosome.


April 21, 2020  |  

Disruption of the kringle 1 domain of prothrombin leads to late onset mortality in zebrafish

The ability to prevent blood loss in response to injury is a critical, evolutionarily conserved function of all vertebrates. Prothrombin (F2) contributes to both primary and secondary hemostasis through the activation of platelets and the conversion of soluble fibrinogen to insoluble fibrin, respectively. Complete prothrombin deficiency has never been observed in humans and is incompatible with life in mice, limiting the ability to understand the entirety of prothrombin’s in vivo functions. We have previously demonstrated the ability of zebrafish to tolerate loss of both pro- and anticoagulant factors that are embryonic lethal in mammals, making them an ideal model for the study of prothrombin deficiency. Using genome editing with TALENs, we have generated a null allele in zebrafish f2. Homozygous mutant embryos develop normally into early adulthood, but demonstrate eventual complete mortality with the majority of fish succumbing to internal hemorrhage by 2 months of age. We show that despite the extended survival, the mutants are unable to form occlusive thrombi in both the venous and arterial systems as early as 3-5 days of life, and we were able to phenocopy this early hemostatic defect using direct oral anticoagulants. When the equivalent mutation was engineered into the homologous residues of human prothrombin, there were severe reductions in secretion and activation, suggesting a possible role for kringle 1 in thrombin maturation, and the possibility that the F1.2 fragment has a functional role in exerting the procoagulant effects of thrombin. Together, our data demonstrate the conserved function of thrombin in zebrafish, as well as the requirement for kringle 1 for biosynthesis and activation by prothrombinase. Understanding how zebrafish are able to develop normally and survive into early adulthood without prothrombin will provide important insight into its pleiotropic functions as well as the management of patients with bleeding disorders.


April 21, 2020  |  

Benchmarking Transposable Element Annotation Methods for Creation of a Streamlined, Comprehensive Pipeline

Sequencing technology and assembly algorithms have matured to the point that high-quality de novo assembly is possible for large, repetitive genomes. Current assemblies traverse transposable elements (TEs) and allow for annotation of TEs. There are numerous methods for each class of elements with unknown relative performance metrics. We benchmarked existing programs based on a curated library of rice TEs. Using the most robust programs, we created a comprehensive pipeline called Extensive de-novo TE Annotator (EDTA) that produces a condensed TE library for annotations of structurally intact and fragmented elements. EDTA is open-source and freely available: https://github.com/oushujun/EDTA.

List of abbreviations
TE: Transposable Elements
LTR: Long Terminal Repeat
LINE: Long Interspersed Nuclear Element
SINE: Short Interspersed Nuclear Element
MITE: Miniature Inverted Transposable Element
TIR: Terminal Inverted Repeat
TSD: Target Site Duplication
TP: True Positives
FP: False Positives
TN: True Negative
FN: False Negatives
GRF: Generic Repeat Finder
EDTA: Extensive de-novo TE Annotator
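The abbreviations TP, FP, TN and FN above are the counts from which benchmarking metrics are usually derived. The following is a minimal sketch using the standard textbook definitions of sensitivity, specificity, accuracy, precision and F1; the abstract does not state which metrics the authors report, so the selection here is an assumption, and the function name `benchmark_metrics` and the example counts are hypothetical.

```python
def benchmark_metrics(tp, fp, tn, fn):
    """Standard classification metrics from confusion-matrix counts.

    Illustrative only: the counts could be, for example, base pairs
    annotated as TE or non-TE relative to a curated reference.
    Assumes every denominator is non-zero.
    """
    sensitivity = tp / (tp + fn)            # share of true TE sequence recovered
    specificity = tn / (tn + fp)            # share of non-TE sequence left unannotated
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)              # share of annotations that are correct
    f1 = 2 * tp / (2 * tp + fp + fn)        # harmonic mean of precision and sensitivity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, chosen only to show the arithmetic.
metrics = benchmark_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting several metrics together matters for this kind of comparison, because a program can reach high sensitivity simply by over-annotating, which would show up as low specificity and precision.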


April 21, 2020  |  

Genome analysis and Hi-C assisted assembly of Elaeagnus angustifolia L., a deciduous tree belonging to Elaeagnaceae

Elaeagnus angustifolia L. is a deciduous tree of the Elaeagnaceae family. It is widely used in the study of abiotic stress tolerance in plants and for the improvement of desertification-affected land due to its characteristics of drought resistance, salt tolerance, cold resistance, wind resistance, and other environmental adaptations. Here, we report the complete genome sequencing using the Pacific Biosciences (PacBio) platform and Hi-C assisted assembly of E. angustifolia. A total of 44.27 Gb raw PacBio Sequel reads were obtained after filtering out low-quality data, with an average length of 8.64 Kb. Assembly using Canu gave an assembly length of 781.09 Mb, with a contig N50 of 486.92 Kb. A total of 39.56 Gb of clean reads was obtained, with a sequencing coverage of 75×, and Q30 ratio > 95.46%. The 510.71 Mb genomic sequence was mapped to the chromosome, accounting for 96.94% of the total length of the sequence, and the corresponding number of sequences was 269, accounting for 45.83% of the total number of sequences. The genome sequence study of E. angustifolia can be a valuable source for the comparative genome analysis of the Elaeagnaceae family members, and can help to understand the evolutionary response mechanisms of the Elaeagnaceae to drought, salt, cold and wind resistance, and thereby provide effective theoretical support for the improvement of desertification-affected land.
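The contig N50 quoted above is a length-weighted summary of assembly contiguity: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the assembly size. A minimal sketch of that standard definition follows; the function name `n50` and the toy contig lengths are illustrative and are not taken from the E. angustifolia assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (arbitrary units), total 300, so the halfway mark is 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 = 150 reaches the halfway mark, so N50 is 70
```

Because the statistic is weighted by length, a few long contigs can raise N50 considerably even when many short contigs remain, which is why it is usually reported alongside total assembly length.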


April 21, 2020  |  

Insect genomes: progress and challenges.

In the wake of constant improvements in sequencing technologies, numerous insect genomes have been sequenced. Currently, 1219 insect genome-sequencing projects have been registered with the National Center for Biotechnology Information, including 401 that have genome assemblies and 155 with an official gene set of annotated protein-coding genes. Comparative genomics analysis showed that the expansion or contraction of gene families was associated with well-studied physiological traits such as immune system, metabolic detoxification, parasitism and polyphagy in insects. Here, we summarize the progress of insect genome sequencing, with an emphasis on how this impacts research on pest control. We begin with a brief introduction to the basic concepts of genome assembly, annotation and metrics for evaluating the quality of draft assemblies. We then provide an overview of genome information for numerous insect species, highlighting examples from prominent model organisms, agricultural pests and disease vectors. We also introduce the major insect genome databases. The increasing availability of insect genomic resources is beneficial for developing alternative pest control methods. However, many opportunities remain for developing data-mining tools that make maximal use of the available insect genome resources. Although rapid progress has been achieved, many challenges remain in the field of insect genomics. © 2019 The Royal Entomological Society.


April 21, 2020  |  

Extended haplotype phasing of de novo genome assemblies with FALCON-Phase

Haplotype-resolved genome assemblies are important for understanding how combinations of variants impact phenotypes. These assemblies can be created in various ways, such as use of tissues that contain single-haplotype (haploid) genomes, or by co-sequencing of parental genomes, but these approaches can be impractical in many situations. We present FALCON-Phase, which integrates long-read sequencing data and ultra-long-range Hi-C chromatin interaction data of a diploid individual to create high-quality, phased diploid genome assemblies. The method was evaluated by application to three datasets, including human, cattle, and zebra finch, for which high-quality, fully haplotype resolved assemblies were available for benchmarking. Phasing algorithm accuracy was affected by heterozygosity of the individual sequenced, with higher accuracy for cattle and zebra finch (>97%) compared to human (82%). In addition, scaffolding with the same Hi-C chromatin contact data resulted in phased chromosome-scale scaffolds.


April 21, 2020  |  

Divergent selection following speciation in two ectoparasitic honey bee mites

Multispecies host-parasite evolution is common, but how parasites evolve after speciating remains poorly understood. Shared evolutionary history and physiology may propel species along similar evolutionary trajectories whereas pursuing different strategies can reduce competition. We test these scenarios in the economically important association between honey bees and ectoparasitic mites by sequencing the genomes of the sister mite species Varroa destructor and Varroa jacobsoni. These genomes were closely related, with 99.7% sequence identity. Among the 9,628 orthologous genes, 4.8% showed signs of positive selection in at least one species. Divergent selective trajectories were discovered in conserved chemosensory gene families (IGR, SNMP), and Halloween genes (CYP) involved in moulting and reproduction. However, there was little overlap in these gene sets and associated GO terms, indicating different selective regimes operating on each of the parasites. Based on our findings, we suggest that species-specific strategies may be needed to combat evolving parasite communities.


April 21, 2020  |  

Virus-host coexistence in phytoplankton through the genomic lens

Phytoplankton-virus interactions are major determinants of geochemical cycles in the oceans. Viruses are responsible for the redirection of carbon and nutrients away from larger organisms back towards microorganisms via the lysis of microalgae in a process coined the “viral shunt”. Virus-host interactions are generally expected to follow “boom and bust” dynamics, whereby a numerically dominant strain is lysed and replaced by a virus resistant strain. Here, we isolated a microalga, Ostreococcus mediterraneus, and its infective nucleo-cytoplasmic large DNA virus (NCLDV) concomitantly from the environment in the surface NW Mediterranean Sea, and show continuous growth in culture of both the microalga and the virus. Evolution experiments through single cell bottlenecks demonstrate that, in the absence of the virus, susceptible cells evolve from one ancestral resistant single cell, and vice-versa; that is, resistant cells evolve from one ancestral susceptible cell. This provides evidence that the observed sustained viral production is the consequence of a minority of virus-susceptible cells. The emergence of these cells is explained by low-level phase switching between virus-resistant and virus-susceptible phenotypes, akin to a bet hedging strategy. Whole genome sequencing and analysis of the ~14 Mb microalga and the ~200 kb virus point towards ancient speciation of the microalga within the Ostreococcus species complex and frequent gene exchanges between prasinoviruses infecting Ostreococcus species. Re-sequencing of one susceptible strain demonstrated that the phase switch involved a large 60 Kb deletion of one chromosome. This chromosome is an outlier chromosome compared to the streamlined, gene dense, GC-rich standard chromosomes, as it contains many repeats and few orthologous genes. While this chromosome has been described in three different genera, its size increments have been previously associated with antiviral immunity and resistance in another species from the same genus. Mathematical modelling of this mechanism predicts microalga-virus population dynamics consistent with the observation of continuous growth of both virus and microalga. Altogether, our results suggest a previously overlooked strategy in phytoplankton-virus interactions.


April 21, 2020  |  

Centromere-mediated chromosome break drives karyotype evolution in closely related Malassezia species

Intra-chromosomal or inter-chromosomal genomic rearrangements often lead to speciation. Loss or gain of a centromere leads to alterations in chromosome number in closely related species. Thus, centromeres can enable tracing the path of evolution from the ancestral to a derived state. The Malassezia species complex of the phylum Basidiomycota shows remarkable diversity in chromosome number ranging between six and nine chromosomes. To understand these transitions, we experimentally identified all eight centromeres as binding sites of an evolutionarily conserved outer kinetochore protein Mis12/Mtw1 in M. sympodialis. The 3 to 5 kb centromere regions share an AT-rich, poorly transcribed core region enriched with a 12 bp consensus motif. We also mapped nine such AT-rich centromeres in M. globosa and the related species Malassezia restricta and Malassezia slooffiae. While eight predicted centromeres were found within conserved synteny blocks between these species and M. sympodialis, the remaining centromere in M. globosa (MgCEN2) or its orthologous centromere in M. slooffiae (MslCEN4) and M. restricta (MreCEN8) mapped to a synteny breakpoint compared with M. sympodialis. Taken together, we provide evidence that breakage and loss of a centromere (CEN2) in an ancestral Malassezia species possessing nine chromosomes resulted in fewer chromosomes in M. sympodialis. Strikingly, the predicted centromeres of all closely related Malassezia species map to an AT-rich core on each chromosome that also shows enrichment of the 12 bp sequence motif. We propose that centromeres are fragile AT-rich sites driving karyotype diversity through breakage and inactivation in these and other species.


April 21, 2020  |  

Chromosome-level assembly of the common lizard (Zootoca vivipara) genome

Squamate reptiles exhibit high variation in their traits and geographical distribution and are therefore fascinating taxa for evolutionary and ecological research. However, high-quality genomic resources are very limited for this group of species, which inhibits some research efforts. To address this gap, we assembled a high-quality genome of the common lizard Zootoca vivipara (Lacertidae) using a combination of high coverage Illumina (shotgun and mate-pair) and PacBio sequence data, with RNAseq data and genetic linkage maps. The 1.46 Gbp genome assembly has scaffold N50 of 11.52 Mbp with N50 contig size of 220.4 Kbp and only 2.96% gaps. A BUSCO analysis indicates that 97.7% of the single-copy Tetrapoda orthologs were recovered in the assembly. In total 19,829 gene models were annotated in the genome using a combination of three ab initio and homology-based methods. To improve the chromosome-level assembly, we generated a high-density linkage map from wild-caught families and developed a novel analytical pipeline to accommodate multiple paternity and unknown father genotypes. We successfully anchored and oriented almost 90% of the genome on 19 linkage groups. This annotated and oriented chromosome-level reference genome represents a valuable resource to facilitate evolutionary studies in squamate reptiles.


April 21, 2020  |  

A chromosome-level genome of black rockfish, Sebastes schlegelii, provides insights into the evolution of live birth.

Black rockfish (Sebastes schlegelii) is a teleost species where eggs are fertilized internally and retained in the maternal reproductive system, where they undergo development until live birth (termed viviparity). In the present study, we report a chromosome-level black rockfish genome assembly. High-throughput transcriptome analysis (RNA-seq and ATAC-seq), coupled with in situ hybridization (ISH) and immunofluorescence, identify several candidate genes for maternal preparation, sperm storage and release, and hatching. We propose that zona pellucida (ZP) proteins retain sperm at the oocyte envelope, while genes in two distinct astacin metalloproteinase subfamilies serve to release sperm from the ZP and free the embryo from chorion at pre-hatching stage. Finally, we present a model of black rockfish reproduction, and propose that the rockfish ovarian wall has a similar function to the uterus of mammals. Taken together, these genomic data reveal unprecedented insights into the evolution of an unusual teleost life history strategy, and provide a sound foundation for studying viviparity in non-mammalian vertebrates and an invaluable resource for rockfish ecological and evolutionary research. This article is protected by copyright. All rights reserved.


April 21, 2020  |  

Lateral transfers of large DNA fragments spread functional genes among grasses.

A fundamental tenet of multicellular eukaryotic evolution is that vertical inheritance is paramount, with natural selection acting on genetic variants transferred from parents to offspring. This lineal process means that an organism’s adaptive potential can be restricted by its evolutionary history, the amount of standing genetic variation, and its mutation rate. Lateral gene transfer (LGT) theoretically provides a mechanism to bypass many of these limitations, but the evolutionary importance and frequency of this process in multicellular eukaryotes, such as plants, remains debated. We address this issue by assembling a chromosome-level genome for the grass Alloteropsis semialata, a species surmised to exhibit two LGTs, and screen it for other grass-to-grass LGTs using genomic data from 146 other grass species. Through stringent phylogenomic analyses, we discovered 57 additional LGTs in the A. semialata nuclear genome, involving at least nine different donor species. The LGTs are clustered in 23 laterally acquired genomic fragments that are up to 170 kb long and have accumulated during the diversification of Alloteropsis. The majority of the 59 LGTs in A. semialata are expressed, and we show that they have added functions to the recipient genome. Functional LGTs were further detected in the genomes of five other grass species, demonstrating that this process is likely widespread in this globally important group of plants. LGT therefore appears to represent a potent evolutionary force capable of spreading functional genes among distantly related grass species. Copyright © 2019 the Author(s). Published by PNAS.


April 21, 2020  |  

High satellite repeat turnover in great apes studied with short- and long-read technologies.

Satellite repeats are a structural component of centromeres and telomeres, and in some instances their divergence is known to drive speciation. Due to their highly repetitive nature, satellite sequences have been understudied and underrepresented in genome assemblies. To investigate their turnover in great apes, we studied satellite repeats of unit sizes up to 50 bp in human, chimpanzee, bonobo, gorilla, and Sumatran and Bornean orangutans, using unassembled short and long sequencing reads. The density of satellite repeats, as identified from accurate short reads (Illumina), varied greatly among great ape genomes. These were dominated by a handful of abundant repeated motifs, frequently shared among species, which formed two groups: (1) the (AATGG)n repeat (critical for heat shock response) and its derivatives; and (2) subtelomeric 32-mers involved in telomeric metabolism. Using the densities of abundant repeats, individuals could be classified into species. However, clustering did not reproduce the accepted species phylogeny, suggesting rapid repeat evolution. Several abundant repeats were enriched in males vs. females; using Y chromosome assemblies or Fluorescent In Situ Hybridization, we validated their location on the Y. Finally, applying a novel computational tool, we identified many satellite repeats completely embedded within long Oxford Nanopore and Pacific Biosciences reads. Such repeats were up to 59 kb in length and consisted of perfect repeats interspersed with other similar sequences. Our results based on sequencing reads generated with three different technologies provide the first detailed characterization of great ape satellite repeats, and open new avenues for exploring their functions. © The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


April 21, 2020  |  

Fast and accurate long-read assembly with wtdbg2

Existing long-read assemblers require tens of thousands of CPU hours to assemble a human genome and are being outpaced by sequencing technologies in terms of both throughput and cost. We developed a novel long-read assembler wtdbg2 that, for human data, is tens of times faster than published tools while achieving comparable contiguity and accuracy. It represents a significant algorithmic advance and paves the way for population-scale long-read assembly in future.


April 21, 2020  |  

Strengths and potential pitfalls of hay-transfer for ecological restoration revealed by RAD-seq analysis in floodplain Arabis species

Achieving high intraspecific genetic diversity is a critical goal in ecological restoration as it increases the adaptive potential and long-term resilience of populations. Thus, we investigated genetic diversity within and between pristine sites in a fossil floodplain and compared it to sites restored by hay-transfer between 1997 and 2014. RAD-seq genotyping revealed that the stenoecious flood-plain species Arabis nemorensis is co-occurring with individuals that, based on ploidy, ITS-sequencing and morphology, probably belong to the close relative Arabis sagittata, which has a documented preference for dry calcareous grasslands but has not been reported in floodplain meadows. We show that hay-transfer maintains genetic diversity for both species. Additionally, in A. sagittata, transfer from multiple genetically isolated pristine sites resulted in restored sites with increased diversity and admixed local genotypes. In A. nemorensis, transfer did not create novel admixture dynamics because genetic diversity between pristine sites was less differentiated. Thus, the effects of hay-transfer on genetic diversity also depend on the genetic makeup of the donor communities of each species, especially when local material is mixed. Our results demonstrate the efficiency of hay-transfer for habitat restoration and emphasize the importance of pre-restoration characterization of micro-geographic patterns of intraspecific diversity of the community to guarantee that restoration practices reach their goal, i.e. maximize the adaptive potential of the entire restored plant community. Overlooking these patterns may alter the balance between species in the community. Additionally, our comparison of summary statistics obtained from de novo and reference-based RAD-seq pipelines shows that the genomic impact of restoration can be reliably monitored in species lacking prior genomic knowledge.

