Arabica coffee, revered for its taste and aroma, has a complex genome. It is an allotetraploid (2n=4x=44) with a genome size of approximately 1.3 Gb, derived from the recent (< 0.6 Mya) hybridization of two diploid progenitors (2n=2x=22), C. canephora (710 Mb) and C. eugenioides (670 Mb). The two parental species diverged recently (< 4.2 Mya) and their genomes are highly homologous. To facilitate assembly, a dihaploid plant was chosen for sequencing. Initial genome assembly attempts with short read data produced an assembly covering 1,031 Mb of the C. arabica genome with a contig L50 of 9 kb. By implementation of long read…
PacBio Sequencing is characterized by very long sequence reads (averaging > 10,000 bases), lack of GC-bias, and high consensus accuracy. These features have allowed the method to provide a new gold standard in de novo genome assemblies, producing highly contiguous (contig N50 > 1 Mb) and accurate (> QV 50) genome assemblies. We will briefly describe the technology and then highlight the full workflow, from sample preparation through sequencing to data analysis, on examples of insect genome assemblies, and illustrate the difference these high-quality genomes represent with regard to biological insights, compared to fragmented draft assemblies generated by short-read sequencing.
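The two quality figures quoted above (contig N50 and QV) can be illustrated with a minimal sketch. It assumes the common definitions: N50 is the contig length at which the cumulative length of contigs, sorted longest first, reaches half the total assembly length, and QV is a Phred-scaled score, so the per-base error probability is 10^(-QV/10). The function names and the toy contig lengths are hypothetical and do not come from any PacBio tool.

```python
def n50(lengths):
    """Return the contig N50 for a list of contig lengths (in bases)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the
        # assembly size defines the N50.
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one contig length")


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


if __name__ == "__main__":
    toy_contigs = [2, 3, 4, 5, 6]  # hypothetical lengths, for illustration only
    print("N50:", n50(toy_contigs))
    print("Error rate at QV 50:", qv_to_error_rate(50))
```

Under these definitions, an assembly at QV 50 is expected to carry about one error per 100,000 bases.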
Genes are the future of coffee. Not nitro cold brewing or beans pooped out by civets, but genes. And coffee’s gene-fueled future just drew nearer, now that scientists have sequenced the genome of the Coffea arabica coffee plant—the species that makes up the vast majority of global production—and made the data public. That means the world is in for a coffee renaissance, as breeders use the information to develop new plant varieties—think new flavors and better resistance to cold and disease. That means more coffee grown in more places, a big deal as global warming throws local climates into chaos.
One of the longstanding challenges in infectious disease has been the lack of high-quality reference genomes. However, developments in genome sequencing are helping researchers overcome this barrier. Recently, highly contiguous genome assemblies of Plasmodium falciparum, Aedes aegypti, and multiple trypanosomes have become available. The number of reference genomes for bacteria that cause infectious disease is similarly expanding rapidly. In this webinar, Meredith Ashby discusses how these new resources are already yielding new biological insights into critical questions in infectious disease research, including how parasites evade the immune system and how pathogens are adapting to evolutionary pressures.
Horizontal transfer of plasmids encoding antimicrobial-resistance and virulence determinants has been instrumental in Staphylococcus aureus evolution, including the emergence of community-associated methicillin-resistant S. aureus (CA-MRSA). In the early 1990s the first CA-MRSA isolated in Western Australia (WA), WA-5, encoded cadmium, tetracycline and penicillin-resistance genes on plasmid pWBG753 (~30 kb). WA-5 and pWBG753 appeared only briefly in WA; however, fusidic-acid-resistance plasmids related to pWBG753 were also present in the first European CA-MRSA at the time. Here we characterized a 72-kb conjugative plasmid pWBG731 present in multiresistant WA-5-like clones from the same period. pWBG731 was a cointegrant formed from pWBG753 and a…
Genome-wide association studies (GWAS) have identified many genomic loci associated with risk for schizophrenia, but unambiguous identification of the relationship between disease-associated variants and specific genes, and in particular their effect on risk conferring transcripts, has proven difficult. To better understand the specific molecular mechanism(s) at the schizophrenia locus in 11q25, we undertook cis expression quantitative trait loci (cis-eQTL) mapping for this 2 megabase genomic region using postmortem human brain samples. To comprehensively assess the effects of genetic risk upon local expression, we evaluated multiple transcript features: genes, exons, and exon-exon junctions in multiple brain regions - dorsolateral prefrontal cortex (DLPFC), hippocampus,…
Xylella fastidiosa is an economically important bacterial plant pathogen. With insights gained from 72 genomes, this study investigated differences among the three main subspecies, which have allopatric origins: X. fastidiosa subsp. fastidiosa, multiplex, and pauca. The origin of recombinogenic X. fastidiosa subsp. morus and sandyi was also assessed. The evolutionary rate of the 622 genes of the species core genome was estimated at the scale of an X. fastidiosa subsp. pauca subclade (7.62 × 10⁻⁷ substitutions per site per year), which was subsequently used to estimate divergence time for the subspecies and introduction events. The study characterized genes present in the accessory…
HIV elite controllers represent a remarkable minority of patients who maintain normal CD4+ T-cell counts and low or undetectable viral loads for decades in the absence of antiretroviral therapy. To examine the possible contribution of virus attenuation to elite control, we obtained a primary HIV-1 isolate from an elite controller who had been infected for 19 years, the last 10 of which were in the absence of antiretroviral therapy. Full-length sequencing of this isolate revealed a highly unusual V1 domain in Envelope (Env). The V1 domain in this HIV-1 strain was 49 amino acids, placing it in the top 1% of…
Acer yangbiense is a newly described, critically endangered endemic maple tree confined to Yangbi County in Yunnan Province in Southwest China. It was included in a programme for rescuing the most threatened species in China, focusing on “plant species with extremely small populations (PSESP)”. We generated 64, 94, and 110 Gb of raw DNA sequences and obtained a chromosome-level genome assembly of A. yangbiense through a combination of Pacific Biosciences Single-molecule Real-time, Illumina HiSeq X, and Hi-C mapping, respectively. The final genome assembly is ~666 Mb, with 13 chromosomes covering ~97% of the genome and a scaffold N50 of 45 Mb.…
Nematode-trapping fungi (NTF) are a large and diverse group of fungi, which may switch from a saprotrophic to a predatory lifestyle if nematodes are present. Different fungi have developed different trapping devices, ranging from adhesive cells to constricting rings. After trapping, fungal hyphae penetrate the worm, secrete lytic enzymes and form a hyphal network inside the body. We sequenced the genome of Duddingtonia flagrans, a biotechnologically important NTF used to control nematode populations in fields. The 36.64 Mb genome encodes 9,927 putative proteins, among which are more than 638 predicted secreted proteins. Most secreted proteins are lytic enzymes, but more…
Circulating DNA in plasma consists of short DNA fragments. The biological processes generating such fragments are not well understood. DNASE1L3 is a secreted DNASE1-like nuclease capable of digesting DNA in chromatin, and its absence causes anti-DNA responses and autoimmunity in humans and mice. We found that the deletion of Dnase1l3 in mice resulted in aberrations in the fragmentation of plasma DNA. Such aberrations included an increase in short DNA molecules below 120 bp, which was positively correlated with anti-DNA antibody levels. We also observed an increase in long, multinucleosomal DNA molecules and decreased frequencies of the most common end motifs…
To better understand the immune system of shrimp, this study combined PacBio isoform sequencing (Iso-Seq) and Illumina paired-end short reads sequencing methods to discover full-length immune-related molecules of the Pacific white shrimp, Litopenaeus vannamei. A total of 72,648 nonredundant full-length transcripts (unigenes) were generated with an average length of 2545 bp from five main tissues, including the hepatopancreas, cardiac stomach, heart, muscle, and pyloric stomach. These unigenes exhibited a high annotation rate (62,164, 85.57%) when compared against NR, NT, Swiss-Prot, Pfam, GO, KEGG and COG databases. A total of 7544 putative long noncoding RNAs (lncRNAs) were detected and 1164 nonredundant…
Infectious disease is both a major force of selection in nature and a prime cause of yield loss in agriculture. In plants, disease resistance is often conferred by nucleotide-binding leucine-rich repeat (NLR) proteins, intracellular immune receptors that recognize pathogen proteins and their effects on the host. Consistent with extensive balancing and positive selection, NLRs are encoded by one of the most variable gene families in plants, but the true extent of intraspecific NLR diversity has been unclear. Here, we define a nearly complete species-wide pan-NLRome in Arabidopsis thaliana based on sequence enrichment and long-read sequencing. The pan-NLRome largely saturates with…
The development of clustered regularly interspaced short-palindromic repeat (CRISPR)-Cas systems for genome editing has transformed the way life science research is conducted and holds enormous potential for the treatment of disease as well as for many aspects of biotechnology. Here, I provide a personal perspective on the development of CRISPR-Cas9 for genome editing within the broader context of the field and discuss our work to discover novel Cas effectors and develop them into additional molecular tools. The initial demonstration of Cas9-mediated genome editing launched the development of many other technologies, enabled new lines of biological inquiry, and motivated…
Corals comprise a biomineralizing cnidarian, dinoflagellate algal symbionts, and associated microbiome of prokaryotes and viruses. Ongoing efforts to conserve coral reefs by identifying the major stress response pathways and thereby laying the foundation to select resistant genotypes rely on a robust genomic foundation. Here we generated and analyzed a high quality long-read based ~886 Mbp nuclear genome assembly and transcriptome data from the dominant rice coral, Montipora capitata from Hawai’i. Our work provides insights into the architecture of coral genomes and shows how they differ in size and gene inventory, putatively due to population size variation. We describe a recent…