Domestication of clonally propagated crops such as pineapple from South America was hypothesized to be a ‘one-step operation’. We sequenced the genome of Ananas comosus var. bracteatus CB5 and assembled 513?Mb into 25 chromosomes with 29,412 genes. Comparison of the genomes of CB5, F153 and MD2 elucidated the genomic basis of fiber production, color formation, sugar accumulation and fruit maturation. We also resequenced 89 Ananas genomes. Cultivars ‘Smooth Cayenne’ and ‘Queen’ exhibited ancient and recent admixture, while ‘Singapore Spanish’ supported a one-step operation of domestication. We identified 25 selective sweeps, including a strong sweep containing a pair of tandemly duplicated bromelain inhibitors. Four candidate genes for self-incompatibility were linked in F153, but were not functional in self-compatible CB5. Our findings support the coexistence of sexual recombination and a one-step operation in the domestication of clonally propagated crops. This work guides the exploration of sexual and asexual domestication trajectories in other clonally propagated crops.
Evolution of a 72-kb cointegrant, conjugative multiresistance plasmid from early community-associated methicillin-resistant Staphylococcus aureus isolates.
Horizontal transfer of plasmids encoding antimicrobial-resistance and virulence determinants has been instrumental in Staphylococcus aureus evolution, including the emergence of community-associated methicillin-resistant S. aureus (CA-MRSA). In the early 1990s the first CA-MRSA isolated in Western Australia (WA), WA-5, encoded cadmium, tetracycline and penicillin-resistance genes on plasmid pWBG753 (~30 kb). WA-5 and pWBG753 appeared only briefly in WA, however, fusidic-acid-resistance plasmids related to pWBG753 were also present in the first European CA-MRSA at the time. Here we characterized a 72-kb conjugative plasmid pWBG731 present in multiresistant WA-5-like clones from the same period. pWBG731 was a cointegrant formed from pWBG753 and a pWBG749-family conjugative plasmid. pWBG731 carried mupirocin, trimethoprim, cadmium and penicillin-resistance genes. The stepwise evolution of pWBG731 likely occurred through the combined actions of IS257, IS257-dependent miniature inverted-repeat transposable elements (MITEs) and the BinL resolution system of the ß-lactamase transposon Tn552 An evolutionary intermediate ~42-kb non-conjugative plasmid pWBG715, possessed the same resistance genes as pWBG731 but retained an integrated copy of the small tetracycline-resistance plasmid pT181. IS257 likely facilitated replacement of pT181 with conjugation genes on pWBG731, thus enabling autonomous transfer. Like conjugative plasmid pWBG749, pWBG731 also mobilized non-conjugative plasmids carrying oriT mimics. It seems likely that pWBG731 represents the product of multiple recombination events between the WA-5 pWBG753 plasmid and other mobile genetic elements present in indigenous CA-MSSA. The molecular evolution of pWBG731 saliently illustrates how diverse mobile genetic elements can together facilitate rapid accrual and horizontal dissemination of multiresistance in S. aureus CA-MRSA.Copyright © 2019 American Society for Microbiology.
Benchmarking Transposable Element Annotation Methods for Creation of a Streamlined, Comprehensive Pipeline
Sequencing technology and assembly algorithms have matured to the point that high-quality de novo assembly is possible for large, repetitive genomes. Current assemblies traverse transposable elements (TEs) and allow for annotation of TEs. There are numerous methods for each class of elements with unknown relative performance metrics. We benchmarked existing programs based on a curated library of rice TEs. Using the most robust programs, we created a comprehensive pipeline called Extensive de-novo TE Annotator (EDTA) that produces a condensed TE library for annotations of structurally intact and fragmented elements. EDTA is open-source and freely available: https://github.com/oushujun/EDTA.List of abbreviationsTETransposable ElementsLTRLong Terminal RepeatLINELong Interspersed Nuclear ElementSINEShort Interspersed Nuclear ElementMITEMiniature Inverted Transposable ElementTIRTerminal Inverted RepeatTSDTarget Site DuplicationTPTrue PositivesFPFalse PositivesTNTrue NegativeFNFalse NegativesGRFGeneric Repeat FinderEDTAExtensive de-novo TE Annotator
TSD: A Computational Tool To Study the Complex Structural Variants Using PacBio Targeted Sequencing Data.
PacBio sequencing is a powerful approach to study DNA or RNA sequences in a longer scope. It is especially useful in exploring the complex structural variants generated by random integration or multiple rearrangement of endogenous or exogenous sequences. Here, we present a tool, TSD, for complex structural variant discovery using PacBio targeted sequencing data. It allows researchers to identify and visualize the genomic structures of targeted sequences by unlimited splitting, alignment and assembly of long PacBio reads. Application to the sequencing data derived from an HBV integrated human cell line(PLC/PRF/5) indicated that TSD could recover the full profile of HBV integration events, especially for the regions with the complex human-HBV genome integrations and multiple HBV rearrangements. Compared to other long read analysis tools, TSD showed a better performance for detecting complex genomic structural variants. TSD is publicly available at: https://github.com/menggf/tsd. Copyright © 2019 Meng et al.
Vertebrate genomes contain a record of retroviruses that invaded the germlines of ancestral hosts and are passed to offspring as endogenous retroviruses (ERVs). ERVs can impact host function since they contain the necessary sequences for expression within the host. Dogs are an important system for the study of disease and evolution, yet no substantiated reports of infectious retroviruses in dogs exist. Here, we utilized Illumina whole genome sequence data to assess the origin and evolution of a recently active gammaretroviral lineage in domestic and wild canids.We identified numerous recently integrated loci of a canid-specific ERV-Fc sublineage within Canis, including 58 insertions that were absent from the reference assembly. Insertions were found throughout the dog genome including within and near gene models. By comparison of orthologous occupied sites, we characterized element prevalence across 332 genomes including all nine extant canid species, revealing evolutionary patterns of ERV-Fc segregation among species as well as subpopulations.Sequence analysis revealed common disruptive mutations, suggesting a predominant form of ERV-Fc spread by trans complementation of defective proviruses. ERV-Fc activity included multiple circulating variants that infected canid ancestors from the last 20 million to within 1.6 million years, with recent bursts of germline invasion in the sublineage leading to wolves and dogs.