Recent studies suggest that closely related species can accumulate substantial genetic and phenotypic differences despite ongoing gene flow, thus challenging traditional ideas regarding the genetics of speciation. Baboons (genus Papio) are Old World monkeys consisting of six readily distinguishable species. Baboon species hybridize in the wild, and prior data imply a complex history of differentiation and introgression. We produced a reference genome assembly for the olive baboon (Papio anubis) and whole-genome sequence data for all six extant species. We document multiple episodes of admixture and introgression during the radiation of Papio baboons, thus demonstrating their value as a model of…
Contaminant sequences that appear in published genomes can cause numerous problems for downstream analyses, particularly for evolutionary studies and metagenomics projects. Our large-scale scan of complete and draft bacterial and archaeal genomes in the NCBI RefSeq database reveals that 2250 genomes are contaminated by human sequence. The contaminant sequences derive primarily from high-copy human repeat regions, which themselves are not adequately represented in the current human reference genome, GRCh38. The absence of the sequences from the human assembly offers a likely explanation for their presence in bacterial assemblies. In some cases, the contaminating contigs have been erroneously annotated as containing…
In order to provide a comprehensive resource for human structural variants (SVs), we generated long-read sequence data and analyzed SVs for fifteen human genomes. We sequence resolved 99,604 insertions, deletions, and inversions including 2,238 (1.6 Mbp) that are shared among all discovery genomes with an additional 13,053 (6.9 Mbp) present in the majority, indicating minor alleles or errors in the reference. Genotyping in 440 additional genomes confirms the most common SVs in unique euchromatin are now sequence resolved. We report a ninefold SV bias toward the last 5 Mbp of human chromosomes with nearly 55% of all VNTRs (variable number…
Several algorithms have been developed that use high-throughput sequencing technology to characterize structural variations (SVs). Most of the existing approaches focus on detecting relatively simple types of SVs such as insertions, deletions and short inversions. In fact, complex SVs are of crucial importance and several have been associated with genomic disorders. To better understand the contribution of complex SVs to human disease, we need new algorithms to accurately discover and genotype such variants. Additionally, due to similar sequencing signatures, inverted duplications or gene conversion events that include inverted segmental duplications are often characterized as simple inversions, likewise, duplications and gene…
We report a family with progressive myoclonic epilepsy who underwent whole-exome sequencing but was negative for pathogenic variants. Similar clinical courses of a devastating neurodegenerative phenotype of two affected siblings were highly suggestive of a genetic etiology, which indicates that the survey of genetic variation by whole-exome sequencing was not comprehensive. To investigate the presence of a variant that remained unrecognized by standard genetic testing, PacBio long-read sequencing was performed. Structural variant (SV) detection using low-coverage (6×) whole-genome sequencing called 17,165 SVs (7,216 deletions and 9,949 insertions). Our SV selection narrowed down potential candidates to only five SVs (two deletions…
Vertebrate genomes contain a record of retroviruses that invaded the germlines of ancestral hosts and are passed to offspring as endogenous retroviruses (ERVs). ERVs can impact host function since they contain the necessary sequences for expression within the host. Dogs are an important system for the study of disease and evolution, yet no substantiated reports of infectious retroviruses in dogs exist. Here, we utilized Illumina whole genome sequence data to assess the origin and evolution of a recently active gammaretroviral lineage in domestic and wild canids.We identified numerous recently integrated loci of a canid-specific ERV-Fc sublineage within Canis, including 58…