April 21, 2020  |  

A critical comparison of technologies for a plant genome sequencing project.

A high-quality genome sequence of any model organism is an essential starting point for genetic and other studies. Older clone-based methods are slow and expensive, whereas faster, cheaper short-read-only assemblies can be incomplete and highly fragmented, which minimizes their usefulness. The last few years have seen the introduction of many new technologies for genome assembly. These new technologies and associated new algorithms are typically benchmarked on microbial genomes or, if they scale appropriately, on larger (e.g., human) genomes. However, plant genomes can be much more repetitive and larger than the human genome, and plant biochemistry often makes obtaining high-quality DNA that is free from contaminants difficult. Reflecting their challenging nature, we observe that plant genome assembly statistics are typically poorer than for vertebrates. Here, we compare Illumina short read, Pacific Biosciences long read, 10x Genomics linked reads, Dovetail Hi-C, and BioNano Genomics optical maps, singly and combined, in producing high-quality long-range genome assemblies of the potato species Solanum verrucosum. We benchmark the assemblies for completeness and accuracy, as well as DNA, compute requirements, and sequencing costs. The field of genome sequencing and assembly is reaching maturity, and the differences we observe between assemblies are surprisingly small. We expect that our results will be helpful to other genome projects, and that these datasets will be used in benchmarking by assembly algorithm developers. © The Author(s) 2019. Published by Oxford University Press.


April 21, 2020  |  

Comparative Transcriptomic Profiling of Yersinia enterocolitica O:3 and O:8 Reveals Major Expression Differences of Fitness- and Virulence-Relevant Genes Indicating Ecological Separation.

Yersinia enterocolitica is a zoonotic pathogen and an important cause of bacterial gastrointestinal infections in humans. Large-scale population genomic analyses revealed genetic and phenotypic diversity of this bacterial species, but little is known about the differences in the transcriptome organization, small RNA (sRNA) repertoire, and transcriptional output. Here, we present the first comparative high-resolution transcriptome analysis of Y. enterocolitica strains representing highly pathogenic phylogroup 2 (serotype O:8) and moderately pathogenic phylogroup 3 (serotype O:3) grown under four infection-relevant conditions. Our transcriptome sequencing (RNA-seq) approach revealed 1,299 and 1,076 transcriptional start sites and identified strain-specific sRNAs that could contribute to differential regulation among the phylogroups. Comparative transcriptomics further uncovered major gene expression differences, in particular, in the temperature-responsive regulon. Multiple virulence-relevant genes are differentially regulated between the two strains, supporting an ecological separation of phylogroups with certain niche-adapted properties. Strong upregulation of the ystA enterotoxin gene in combination with constitutive high expression of cell invasion factor InvA further showed that the toxicity of recent outbreak O:3 strains has increased. Overall, our report provides new insights into the specific transcriptome organization of phylogroups 2 and 3 and reveals gene expression differences contributing to the substantial phenotypic differences that exist between the lineages. IMPORTANCE: Yersinia enterocolitica is a major diarrheal pathogen and is associated with a large range of gut-associated diseases. Members of this species have evolved into different phylogroups with genotypic variations. We performed the first characterization of the Y. enterocolitica transcriptional landscape and tracked the consequences of the genomic variations between two different pathogenic phylogroups by comparing their RNA repertoire, promoter usage, and expression profiles under four different virulence-relevant conditions. Our analysis revealed major differences in the transcriptional outputs of the closely related strains, pointing to an ecological separation in which one is more adapted to an environmental lifestyle and the other to a mostly mammal-associated lifestyle. Moreover, a variety of pathoadaptive alterations, including alterations in acid resistance genes, colonization factors, and toxins, were identified which affect virulence and host specificity. This illustrates that comparative transcriptomics is an excellent approach to discover differences in the functional output from closely related genomes affecting niche adaptation and virulence, which cannot be directly inferred from DNA sequences.


April 21, 2020  |  

Divergent evolution in the genomes of closely related lacertids, Lacerta viridis and L. bilineata, and implications for speciation.

Lacerta viridis and Lacerta bilineata are sister species of European green lizards (eastern and western clades, respectively) that, until recently, were grouped together as the L. viridis complex. Genetic incompatibilities were observed between lacertid populations through crossing experiments, which led to the delineation of two separate species within the L. viridis complex. The population history of these sister species and processes driving divergence are unknown. We constructed the first high-quality de novo genome assemblies for both L. viridis and L. bilineata through Illumina and PacBio sequencing, with annotation support provided from transcriptome sequencing of several tissues. To estimate gene flow between the two species and identify factors involved in reproductive isolation, we studied their evolutionary history, identified genomic rearrangements, detected signatures of selection on non-coding RNA, and on protein-coding genes. Here we show that gene flow was primarily unidirectional from L. bilineata to L. viridis after their split at least 1.15 million years ago. We detected positive selection of the non-coding repertoire; mutations in transcription factors; accumulation of divergence through inversions; selection on genes involved in neural development, reproduction, and behavior, as well as in ultraviolet-response, possibly driven by sexual selection, whose contribution to reproductive isolation between these lacertid species needs to be further evaluated. The combination of short and long sequence reads resulted in one of the most complete lizard genome assemblies. The characterization of a diverse array of genomic features provided valuable insights into the demographic history of divergence among European green lizards, as well as key species differences, some of which are candidates that could have played a role in speciation.
In addition, our study generated valuable genomic resources that can be used to address conservation-related issues in lacertids. © The Author(s) 2018. Published by Oxford University Press.


April 21, 2020  |  

Genome assembly and annotation of the Trichoplusia ni Tni-FNL insect cell line enabled by long-read technologies.

Trichoplusia ni-derived cell lines are commonly used to enable recombinant protein expression via baculovirus infection to generate materials approved for clinical use and in clinical trials. In order to develop systems biology and genome engineering tools to improve protein expression in this host, we performed de novo genome assembly of the Trichoplusia ni-derived cell line Tni-FNL. By integration of PacBio single-molecule sequencing, Bionano optical mapping, and 10X Genomics linked-reads data, we have produced a draft genome assembly of Tni-FNL. Our assembly contains 280 scaffolds, with an N50 scaffold size of 2.3 Mb and a total length of 359 Mb. Annotation of the Tni-FNL genome resulted in 14,101 predicted genes and 93.2% of the predicted proteome contained recognizable protein domains. Ortholog searches within the superorder Holometabola provided further evidence of high accuracy and completeness of the Tni-FNL genome assembly. This first draft Tni-FNL genome assembly was enabled by complementary long-read technologies and represents a high-quality, well-annotated genome that provides novel insight into the complexity of this insect cell line and can serve as a reference for future large-scale genome engineering work in this and other similar recombinant protein production hosts.
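
For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how a scaffold N50 is usually computed: sort the scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the assembly. The function name `scaffold_n50` and the example lengths are illustrative assumptions, not values or tooling from the study.

```python
from typing import Iterable


def scaffold_n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L together
    cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare with integers only, to avoid floating-point rounding.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable: the running total always reaches the total")


if __name__ == "__main__":
    # Hypothetical scaffold lengths (arbitrary units), not data from the paper.
    example = [10, 8, 6, 4, 2]
    print(scaffold_n50(example))
```

A larger N50 indicates a more contiguous assembly, but it says nothing about correctness, which is why abstracts such as this one also report annotation and ortholog-based completeness checks.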


April 21, 2020  |  

Genomic analysis of bacteria in the Acute Oak Decline pathobiome.

The UK’s native oak is under serious threat from Acute Oak Decline (AOD). Stem tissue necrosis is a primary symptom of AOD and several bacteria are associated with necrotic lesions. Two members of the lesion pathobiome, Brenneria goodwinii and Gibbsiella quercinecans, have been identified as causative agents of tissue necrosis. However, additional bacteria including Lonsdalea britannica and Rahnella species have been detected in the lesion microbiome, but their role in tissue degradation is unclear. Consequently, information on potential genome-encoded mechanisms for tissue necrosis is critical to understand the role and mechanisms used by bacterial members of the lesion pathobiome in the aetiology of AOD. Here, the whole genomes of bacteria isolated from AOD-affected trees were sequenced, annotated and compared against canonical bacterial phytopathogens and non-pathogenic symbionts. Using orthologous gene inference methods, shared virulence genes that retain the same function were identified. Furthermore, functional annotation of phytopathogenic virulence genes demonstrated that all studied members of the AOD lesion microbiota possessed genes associated with phytopathogens. However, the genome of B. goodwinii was the most characteristic of a necrogenic phytopathogen, corroborating previous pathological and metatranscriptomic studies that implicate it as the key causal agent of AOD lesions. Furthermore, we investigated the genome sequences of other AOD lesion microbiota to understand the potential ability of microbes to cause disease or contribute to pathogenic potential of organisms isolated from this complex pathobiome. The role of these members remains uncertain but some such as G. quercinecans may contribute to tissue necrosis through the release of necrotizing enzymes and may help more dangerous pathogens activate and realize their pathogenic potential or they may contribute as secondary/opportunistic pathogens with the potential to act as accessory species for B. goodwinii. We demonstrate that in combination with ecological data, whole genome sequencing provides key insights into the pathogenic potential of bacterial species whether they be phytopathogens, part-contributors or stimulators of the pathobiome.


April 21, 2020  |  

De Novo Sequencing and Hybrid Assembly of the Biofuel Crop Jatropha curcas L.: Identification of Quantitative Trait Loci for Geminivirus Resistance.

Jatropha curcas is an important perennial, drought-tolerant plant that has been identified as a potential biodiesel crop. We report here the hybrid de novo genome assembly of J. curcas generated using Illumina and PacBio sequencing technologies, and identification of quantitative trait loci for Jatropha Mosaic Virus (JMV) resistance. In this study, we generated scaffolds of 265.7 Mbp in length, which correspond to 84.8% of the gene space, using Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis. Additionally, 96.4% of predicted protein-coding genes were captured in RNA sequencing data, which reconfirms the accuracy of the assembled genome. The genome was utilized to identify 12,103 dinucleotide simple sequence repeat (SSR) markers, which were exploited in genetic diversity analysis to identify genetically distinct lines. A total of 207 polymorphic SSR markers were employed to construct a genetic linkage map for JMV resistance, using an interspecific F2 mapping population involving susceptible J. curcas and resistant Jatropha integerrima as parents. Quantitative trait locus (QTL) analysis led to the identification of three minor QTLs for JMV resistance, which were validated in an alternate F2 mapping population. These validated QTLs were utilized in marker-assisted breeding for JMV resistance. Comparative genomics of oil-producing genes across selected oil-producing species revealed 27 conserved genes and 2,986 orthologous protein clusters in Jatropha. This reference genome assembly gives an insight into the complex genetic structure of Jatropha, and serves as a source for the development of agronomically improved virus-resistant and oil-producing lines.


April 21, 2020  |  

High-throughput amplicon sequencing of the full-length 16S rRNA gene with single-nucleotide resolution.

Targeted PCR amplification and high-throughput sequencing (amplicon sequencing) of 16S rRNA gene fragments is widely used to profile microbial communities. New long-read sequencing technologies can sequence the entire 16S rRNA gene, but higher error rates have limited their attractiveness when accuracy is important. Here we present a high-throughput amplicon sequencing methodology based on PacBio circular consensus sequencing and the DADA2 sample inference method that measures the full-length 16S rRNA gene with single-nucleotide resolution and a near-zero error rate. In two artificial communities of known composition, our method recovered the full complement of full-length 16S sequence variants from expected community members without residual errors. The measured abundances of intra-genomic sequence variants were in the integral ratios expected from the genuine allelic variants within a genome. The full-length 16S gene sequences recovered by our approach allowed Escherichia coli strains to be correctly classified to the O157:H7 and K12 sub-species clades. In human fecal samples, our method showed strong technical replication and was able to recover the full complement of 16S rRNA alleles in several E. coli strains. There are likely many applications beyond microbial profiling for which high-throughput amplicon sequencing of complete genes with single-nucleotide resolution will be of use. © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.
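
The statement about integral ratios can be illustrated with a small sketch. It is not the DADA2 algorithm or the authors' analysis: it only assumes that the number of 16S rRNA gene copies in a genome is known, rescales the measured variant abundances to that total, and checks that each rescaled value lies close to a whole number. The function names, the tolerance, and the read counts are hypothetical.

```python
from typing import List, Sequence


def expected_copy_ratios(abundances: Sequence[float], total_copies: int) -> List[float]:
    """Rescale variant abundances so that they sum to the genome's 16S copy number."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    return [a * total_copies / total for a in abundances]


def is_near_integral(ratios: Sequence[float], tolerance: float = 0.2) -> bool:
    """True if every ratio is within `tolerance` of a whole number of at least 1."""
    return all(round(r) >= 1 and abs(r - round(r)) <= tolerance for r in ratios)


if __name__ == "__main__":
    # Hypothetical read counts for three sequence variants from one genome
    # that is assumed to carry seven copies of the 16S rRNA gene.
    counts = [500, 100, 100]
    ratios = expected_copy_ratios(counts, total_copies=7)
    print(ratios, is_near_integral(ratios))
```

Genuine allelic variants within one genome should pass a check of this kind, whereas residual sequencing errors would typically appear at low, non-integral fractions of the true variants.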


April 21, 2020  |  

Effector gene reshuffling involves dispensable mini-chromosomes in the wheat blast fungus.

Newly emerged wheat blast disease is a serious threat to global wheat production. Wheat blast is caused by a distinct, exceptionally diverse lineage of the fungus causing rice blast disease. Through sequencing a recent field isolate, we report a reference genome that includes seven core chromosomes and mini-chromosome sequences that harbor effector genes normally found on ends of core chromosomes in other strains. No mini-chromosomes were observed in an early field strain, and at least two from another isolate each contain different effector genes and core chromosome end sequences. The mini-chromosome is enriched in transposons occurring most frequently at core chromosome ends. Additionally, transposons in mini-chromosomes lack the characteristic signature for inactivation by repeat-induced point (RIP) mutation genome defenses. Our results, collectively, indicate that dispensable mini-chromosomes and core chromosomes undergo divergent evolutionary trajectories, and mini-chromosomes and core chromosome ends are coupled as a mobile, fast-evolving effector compartment in the wheat pathogen genome.


April 21, 2020  |  

Genomic inversions and GOLGA core duplicons underlie disease instability at the 15q25 locus.

Human chromosome 15q25 is involved in several disease-associated structural rearrangements, including microdeletions and chromosomal markers with inverted duplications. Using comparative fluorescence in situ hybridization, strand-sequencing, single-molecule, real-time sequencing and Bionano optical mapping analyses, we investigated the organization of the 15q25 region in human and nonhuman primates. We found that two independent inversions occurred in this region after the fission event that gave rise to phylogenetic chromosomes XIV and XV in humans and great apes. One of these inversions is still polymorphic in the human population today and may confer differential susceptibility to 15q25 microdeletions and inverted duplications. The inversion breakpoints map within segmental duplications containing core duplicons of the GOLGA gene family and correspond to the site of an ancestral centromere, which became inactivated about 25 million years ago. The inactivation of this centromere likely released segmental duplications from recombination repression typical of centromeric regions. We hypothesize that this increased the frequency of ectopic recombination creating a hotspot of hominid inversions where dispersed GOLGA core elements now predispose this region to recurrent genomic rearrangements associated with disease.


April 21, 2020  |  

Multi-omics characterization of the necrotrophic mycoparasite Saccharomycopsis schoenii.

Pathogenic yeasts and fungi are an increasing global healthcare burden, but discovery of novel antifungal agents is slow. The mycoparasitic yeast Saccharomycopsis schoenii was recently demonstrated to be able to kill the emerging multi-drug resistant yeast pathogen Candida auris. However, the molecular mechanisms involved in the predatory activity of S. schoenii have not been explored. To this end, we de novo sequenced, assembled and annotated a draft genome of S. schoenii. Using proteomics, we confirmed that Saccharomycopsis yeasts have reassigned the CTG codon and translate CTG into serine instead of leucine. Further, we confirmed an absence of all genes from the sulfate assimilation pathway in the genome of S. schoenii, and detected the expansion of several gene families, including aspartic proteases. Using Saccharomyces cerevisiae as a model prey cell, we homed in on the timing and nutritional conditions under which S. schoenii kills prey cells. We found that a general nutrition limitation, not a specific methionine deficiency, triggered predatory activity. Nevertheless, by means of genome-wide transcriptome analysis we observed dramatic responses to methionine deprivation, which were alleviated when S. cerevisiae was available as prey, and therefore postulate that S. schoenii acquired methionine from its prey cells. During predation, both proteomic and transcriptomic analyses revealed that S. schoenii highly upregulated and translated aspartic protease genes, probably used to break down prey cell walls. With these fundamental insights into the predatory behavior of S. schoenii, we open the way for further exploitation of this yeast as a biocontrol agent and/or source of novel antifungal agents.


April 21, 2020  |  

Capacity to utilize raffinose dictates pneumococcal disease phenotype.

Streptococcus pneumoniae is commonly carried asymptomatically in the human nasopharynx, but it also causes serious and invasive diseases such as pneumonia, bacteremia, and meningitis, as well as less serious but highly prevalent infections such as otitis media. We have previously shown that closely related pneumococci (of the same capsular serotype and multilocus sequence type [ST]) can display distinct pathogenic profiles in mice that correlate with clinical isolation site (e.g., blood versus ear), suggesting stable niche adaptation within a clonal lineage. This has provided an opportunity to identify determinants of disease tropism. Genomic analysis identified 17 and 27 single nucleotide polymorphisms (SNPs) or insertions/deletions in protein coding sequences between blood and ear isolates of serotype 14 ST15 and serotype 3 ST180, respectively. SNPs in raffinose uptake and utilization genes (rafR or rafK) were detected in both serotypes/lineages. Ear isolates were consistently defective in growth in media containing raffinose as the sole carbon source, as well as in expression of raffinose pathway genes aga, rafG, and rafK, relative to their serotype/ST-matched blood isolates. Similar differences were also seen between serotype 23F ST81 blood and ear isolates. Analysis of rafR allelic exchange mutants of the serotype 14 ST15 blood and ear isolates demonstrated that the SNP in rafR was entirely responsible for their distinct in vitro phenotypes and was also the determinant of differential tropism for the lungs versus ear and brain in a mouse intranasal challenge model. These data suggest that the ability of pneumococci to utilize raffinose determines the nature of disease. IMPORTANCE: S. pneumoniae is a component of the commensal nasopharyngeal microflora of humans, but from this reservoir, it can progress to localized or invasive disease with a frequency that translates into massive global morbidity and mortality.
However, the factors that govern the switch from commensal to pathogen, as well as those that determine disease tropism, are poorly understood. Here we show that capacity to utilize raffinose can determine the nature of the disease caused by a given pneumococcal strain. Moreover, our findings provide an interesting example of convergent evolution, whereby pneumococci belonging to two unrelated serotypes/lineages exhibit SNPs in separate genes affecting raffinose uptake and utilization that correlate with distinct pathogenic profiles in vivo. This further underscores the critical role of differential carbohydrate metabolism in the pathogenesis of localized versus invasive pneumococcal disease. Copyright © 2019 Minhas et al.


April 21, 2020  |  

Tissue specific alpha-2-Macroglobulin (A2M) splice isoform diversity in Hilsa shad, Tenualosa ilisha (Hamilton, 1822).

The present study reports, for the first time, twelve A2M isoforms in Tenualosa ilisha, identified through SMRT sequencing. Hilsa shad, T. ilisha, an anadromous fish, faces environmental stresses and is thus prone to diseases. Here, expression profiles of different A2M isoforms were studied in four tissues of T. ilisha to assess the tissue-specific diversity of A2M. Large-scale, high-quality full-length transcripts (>99% accuracy) were obtained from liver, ovary, testes and gill transcriptomes through Iso-sequencing on the PacBio RSII. A total of 12 isoforms, with complete putative proteins, were detected in three tissues (7 isoforms in liver, 4 in ovary and 1 in testes). The complete structure of A2M mRNA was predicted from these isoforms, containing a 4680 bp sequence, 35 exons and 1508 amino acids. With Homo sapiens A2M as reference, six functional domains (A2M_N, A2M_N2, A2M, Thiol-ester_cl, Complement and Receptor domain), along with a bait region, were predicted in the A2M consensus protein. A total of 35 splice sites were identified in the T. ilisha A2M consensus transcript, with the highest frequency (55.7%) of GT-AG splice sites, as compared to that of Homo sapiens. Liver showed the longest isoform (X1), consisting of all domains, while the smallest (X10) was found in ovary, with one Receptor domain. The present study predicted five putative markers (I-212, I-269, A-472, S-567 and Y-906) for EUS disease resistance in the A2M protein, which were present in the MG2 domains (A2M_N and A2M_N2), by comparison with resistant and susceptible/unknown response species. These markers classified fishes into two groups, resistant and susceptible. The potential markers predicted in T. ilisha placed it in the EUS-susceptible category. Putative markers reported in the A2M protein may serve as molecular markers in diagnosis of EUS disease resistance/susceptibility in fishes and may have a potential for inclusion in the marker panel for pilot studies.
Further, challenge studies are required to confirm the role of particular A2M isoforms and markers identified in immune protection against EUS disease.


April 21, 2020  |  

The bacteriocin from the prophylactic candidate Streptococcus suis 90-1330 is widely distributed across S. suis isolates and appears encoded in an integrative and conjugative element.

The Gram-positive α-hemolytic Streptococcus suis is a major pathogen in the swine industry and an emerging zoonotic agent that can cause several systemic issues in both pigs and humans. A total of 35 S. suis serotypes (SS) have been identified and genotyped into >700 sequence types (ST) by multilocus sequence typing (MLST). Eurasian ST1 isolates are the most virulent of all S. suis SS2 strains while North American ST25 and ST28 strains display moderate to low/no virulence phenotypes, respectively. Notably, S. suis 90-1330 is an avirulent Canadian SS2-ST28 isolate producing a lantibiotic bacteriocin with potential prophylactic applications. To investigate the suitability of this strain for such purposes, we sequenced its complete genome using the Illumina and PacBio platforms. The S. suis 90-1330 bacteriocin was found encoded in a locus cargoed in what appears to be an integrative and conjugative element (ICE). This bacteriocin locus was also found to be widely distributed across several streptococcal species and in a few Staphylococcus aureus strains. Because the locus also confers protection from the bacteriocin, the potential prophylactic benefits of using this strain may prove limited due to the spread of resistance to its effects. Furthermore, the S. suis 90-1330 genome was found to code for genes involved in blood survival, suggesting that the strain may not be as benign as previously thought.

