In this study we demonstrate the utility of Single-Molecule Real Time SMRT sequencing to detect variants and to recapitulate whole mitochondrial genomes in an association study of Metabolic syndrome using samples from a well-studied cohort from Micronesia. The Micronesian island of Kosrae is a rare genetic isolate that offers significant advantages for genetic studies of human disease. Kosrae suffers from one of the highest rates of MetS (41%), obesity (52%), and diabetes (17%) globally and has a homogeneous environment making this an excellent population in which to study these significant health problems. We are conducting family-based association analyses aimed at…
Goats are specialized in dairy, meat and fiber production, being adapted to a wide range of environmental conditions and having a large economic impact in developing countries. In the last years, there have been dramatic advances in the knowledge of the structure and diversity of the goat genome/transcriptome and in the development of genomic tools, rapidly narrowing the gap between goat and related species such as cattle and sheep. Major advances are: 1) publication of a de novo goat genome reference sequence; 2) Development of whole genome high density RH maps, and; 3) Design of a commercial 50K SNP array.…
The widespread occurrence of repetitive stretches of DNA in genomes of organisms across the tree of life imposes fundamental challenges for sequencing, genome assembly, and automated annotation of genes and proteins. This multi-level problem can lead to errors in genome and protein databases that are often not recognized or acknowledged. As a consequence, end users working with sequences with repetitive regions are faced with ‘ready-to-use’ deposited data whose trustworthiness is difficult to determine, let alone to quantify. Here, we provide a review of the problems associated with tandem repeat sequences that originate from different stages during the sequencing-assembly-annotation-deposition workflow, and…
The recent advent of long-read sequencing technologies is expected to provide reasonable answers to genetic challenges unresolvable by short-read sequencing, primarily the inability to accurately study structural variations, copy number variations, and homologous repeats in complex parts of the genome. However, long-read sequencing comes along with higher rates of random short deletions and insertions, and single nucleotide errors. The relatively higher sequencing accuracy of short-read sequencing has kept it as the first choice of screening for single nucleotide variants and short deletions and insertions. Albeit, short-read sequencing still suffers from systematic errors that tend to occur at specific positions where…
Satellite repeats are a structural component of centromeres and telomeres, and in some instances their divergence is known to drive speciation. Due to their highly repetitive nature, satellite sequences have been understudied and underrepresented in genome assemblies. To investigate their turnover in great apes, we studied satellite repeats of unit sizes up to 50?bp in human, chimpanzee, bonobo, gorilla, and Sumatran and Bornean orangutans, using unassembled short and long sequencing reads. The density of satellite repeats, as identified from accurate short reads (Illumina), varied greatly among great ape genomes. These were dominated by a handful of abundant repeated motifs, frequently…
Recent studies suggest that closely related species can accumulate substantial genetic and phenotypic differences despite ongoing gene flow, thus challenging traditional ideas regarding the genetics of speciation. Baboons (genus Papio) are Old World monkeys consisting of six readily distinguishable species. Baboon species hybridize in the wild, and prior data imply a complex history of differentiation and introgression. We produced a reference genome assembly for the olive baboon (Papio anubis) and whole-genome sequence data for all six extant species. We document multiple episodes of admixture and introgression during the radiation of Papio baboons, thus demonstrating their value as a model of…
Crucihimalaya himalaica, a close relative of Arabidopsis and Capsella, grows on the Qinghai-Tibet Plateau (QTP) about 4,000 m above sea level and represents an attractive model system for studying speciation and ecological adaptation in extreme environments. We assembled a draft genome sequence of 234.72 Mb encoding 27,019 genes and investigated its origin and adaptive evolutionary mechanisms. Phylogenomic analyses based on 4,586 single-copy genes revealed that C. himalaica is most closely related to Capsella (estimated divergence 8.8 to 12.2 Mya), whereas both species form a sister clade to Arabidopsis thaliana and Arabidopsis lyrata, from which they diverged between 12.7 and 17.2…
Multispecies host-parasite evolution is common, but how parasites evolve after speciating remains poorly understood. Shared evolutionary history and physiology may propel species along similar evolutionary trajectories whereas pursuing different strategies can reduce competition. We test these scenarios in the economically important association between honey bees and ectoparasitic mites by sequencing the genomes of the sister mite species Varroa destructor and Varroa jacobsoni. These genomes were closely related, with 99.7% sequence identity. Among the 9,628 orthologous genes, 4.8% showed signs of positive selection in at least one species. Divergent selective trajectories were discovered in conserved chemosensory gene families (IGR, SNMP), and…