July 19, 2019

De novo assembly of two Swedish genomes reveals missing segments from the human GRCh38 reference and improves variant calling of population-scale sequencing data.

The current human reference sequence (GRCh38) is a foundation for large-scale sequencing projects. However, recent studies have suggested that GRCh38 may be incomplete and give a suboptimal representation of specific population groups. Here, we performed a de novo assembly of two Swedish genomes that revealed over 10 Mb of sequences absent from the human GRCh38 reference in each individual. Around 6 Mb of these novel sequences (NS) are shared with a Chinese personal genome. The NS are highly repetitive, have an elevated GC-content, and are primarily located in centromeric or telomeric regions. Up to 1 Mb of NS can be assigned to chromosome Y, and large segments are also missing from GRCh38 at chromosomes 14, 17, and 21. Inclusion of NS into the GRCh38 reference radically improves the alignment and variant calling from short-read whole-genome sequencing data at several genomic loci. A re-analysis of a Swedish population-scale sequencing project yields > 75,000 putative novel single nucleotide variants (SNVs) and removes > 10,000 false positive SNV calls per individual, some of which are located in protein coding regions. Our results highlight that the GRCh38 reference is not yet complete and demonstrate that personal genome assemblies from local populations can improve the analysis of short-read whole-genome sequencing data.


July 19, 2019

Allele-defined genome of the autopolyploid sugarcane Saccharum spontaneum L.

Modern sugarcanes are polyploid interspecific hybrids, combining high sugar content from Saccharum officinarum with hardiness, disease resistance and ratooning of Saccharum spontaneum. Sequencing of a haploid S. spontaneum, AP85-441, facilitated the assembly of 32 pseudo-chromosomes comprising 8 homologous groups of 4 members each, bearing 35,525 genes with alleles defined. The reduction of basic chromosome number from 10 to 8 in S. spontaneum was caused by fissions of 2 ancestral chromosomes followed by translocations to 4 chromosomes. Surprisingly, 80% of nucleotide binding site-encoding genes associated with disease resistance are located in 4 rearranged chromosomes and 51% of those in rearranged regions. Resequencing of 64 S. spontaneum genomes identified balancing selection in rearranged regions, maintaining their diversity. Introgressed S. spontaneum chromosomes in modern sugarcanes are randomly distributed in the AP85-441 genome, indicating random recombination among homologs in different S. spontaneum accessions. The allele-defined Saccharum genome offers new knowledge and resources to accelerate sugarcane improvement.


July 19, 2019

Improved reference genome of Aedes aegypti informs arbovirus vector control.

Female Aedes aegypti mosquitoes infect more than 400 million people each year with dangerous viral pathogens including dengue, yellow fever, Zika and chikungunya. Progress in understanding the biology of mosquitoes and developing the tools to fight them has been slowed by the lack of a high-quality genome assembly. Here we combine diverse technologies to produce the markedly improved, fully re-annotated AaegL5 genome assembly, and demonstrate how it accelerates mosquito science. We anchored physical and cytogenetic maps, doubled the number of known chemosensory ionotropic receptors that guide mosquitoes to human hosts and egg-laying sites, provided further insight into the size and composition of the sex-determining M locus, and revealed copy-number variation among glutathione S-transferase genes that are important for insecticide resistance. Using high-resolution quantitative trait locus and population genomic analyses, we mapped new candidates for dengue vector competence and insecticide resistance. AaegL5 will catalyse new biological insights and intervention strategies to fight this deadly disease vector.


July 7, 2019

Broad CTL response is required to clear latent HIV-1 due to dominance of escape mutations.

Despite antiretroviral therapy (ART), human immunodeficiency virus (HIV)-1 persists in a stable latent reservoir, primarily in resting memory CD4+ T cells. This reservoir presents a major barrier to the cure of HIV-1 infection. To purge the reservoir, pharmacological reactivation of latent HIV-1 has been proposed and tested both in vitro and in vivo. A key remaining question is whether virus-specific immune mechanisms, including cytotoxic T lymphocytes (CTLs), can clear infected cells in ART-treated patients after latency is reversed. Here we show that there is a striking all-or-none pattern for CTL escape mutations in HIV-1 Gag epitopes. Unless ART is started early, the vast majority (>98%) of latent viruses carry CTL escape mutations that render infected cells insensitive to CTLs directed at common epitopes. To solve this problem, we identified CTLs that could recognize epitopes from latent HIV-1 that were unmutated in every chronically infected patient tested. Upon stimulation, these CTLs eliminated target cells infected with autologous virus derived from the latent reservoir, both in vitro and in patient-derived humanized mice. The predominance of CTL-resistant viruses in the latent reservoir poses a major challenge to viral eradication. Our results demonstrate that chronically infected patients retain a broad-spectrum viral-specific CTL response and that appropriate boosting of this response may be required for the elimination of the latent reservoir.


July 7, 2019

In-depth determination and analysis of the human paired heavy- and light-chain antibody repertoire.

High-throughput immune repertoire sequencing has emerged as a critical step in the understanding of adaptive responses following infection or vaccination or in autoimmunity. However, determination of native antibody variable heavy-light pairs (VH-VL pairs) remains a major challenge, and no technologies exist to adequately interrogate the >1 × 10^6 B cells in typical specimens. We developed a low-cost, single-cell, emulsion-based technology for sequencing antibody VH-VL repertoires from >2 × 10^6 B cells per experiment with demonstrated pairing precision >97%. A simple flow-focusing apparatus was used to sequester single B cells into emulsion droplets containing lysis buffer and magnetic beads for mRNA capture; subsequent emulsion RT-PCR generated VH-VL amplicons for next-generation sequencing. Massive VH-VL repertoire analyses of three human donors provided new immunological insights including (i) the identity, frequency and pairing propensity of shared, or ‘public’, VL genes, (ii) the detection of allelic inclusion (an implicated autoimmune mechanism) in healthy individuals and (iii) the occurrence of antibodies with features, in terms of gene usage and CDR3 length, associated with broadly neutralizing antibodies to rapidly evolving viruses such as HIV-1 and influenza.


July 7, 2019

Characterization of the effect of the histidine kinase CovS on response regulator phosphorylation in group A Streptococcus.

Two-component gene regulatory systems (TCSs) are a major mechanism by which bacteria respond to environmental stimuli and thus are critical to infectivity. For example, the control of virulence regulator/sensor kinase (CovRS) TCS is central to the virulence of the major human pathogen group A Streptococcus (GAS). Here, we used a combination of quantitative in vivo phosphorylation assays, isoallelic strains that varied by only a single amino acid in CovS, and transcriptome analyses to characterize the impact of CovS on CovR phosphorylation and GAS global gene expression. We discovered that CovS primarily serves to phosphorylate CovR, thereby resulting in the repression of virulence factor-encoding genes. However, a GAS strain selectively deficient in CovS phosphatase activity had a distinct transcriptome relative to that of its parental strain, indicating that both CovS kinase and phosphatase activities influence the CovR phosphorylation status. Surprisingly, compared to a serotype M3 strain, serotype M1 GAS strains had high levels of phosphorylated CovR, low transcript levels of CovR-repressed genes, and strikingly different responses to environmental cues. Moreover, the inactivation of CovS in the serotype M1 background resulted in a greater decrease in phosphorylated CovR levels and a greater increase in the transcript levels of CovR-repressed genes than did CovS inactivation in a serotype M3 strain. These data clarify the influence of CovS on the CovR phosphorylation status and provide insight into why serotype M1 GAS strains have high rates of spontaneous mutations in covS during invasive GAS infection, thus providing a link between TCS molecular function and the epidemiology of deadly bacterial infections. Copyright © 2015, American Society for Microbiology. All Rights Reserved.


July 7, 2019

Complete sequences of six IncA/C plasmids of multidrug-resistant Salmonella enterica subsp. enterica serotype Newport.

Multidrug-resistant (MDR) Salmonella enterica subsp. enterica serotype Newport has been a long-standing public health concern in the United States. We present the complete sequences of six IncA/C plasmids from animal-derived MDR S. Newport ranging from 80.1 to 158.5 kb. They shared a genetic backbone with S. Newport IncA/C plasmids pSN254 and pAM04528. Copyright © 2015 Cao et al.


July 7, 2019

Biochemical characterization of a Naegleria TET-like oxygenase and its application in single molecule sequencing of 5-methylcytosine.

Modified DNA bases in mammalian genomes, such as 5-methylcytosine (5mC) and its oxidized forms, are implicated in important epigenetic regulation processes. In human or mouse, successive enzymatic conversion of 5mC to its oxidized forms is carried out by the ten-eleven translocation (TET) proteins. Previously we reported the structure of a TET-like 5mC oxygenase (NgTET1) from Naegleria gruberi, a single-celled protist evolutionarily distant from vertebrates. Here we show that NgTET1 is a 5-methylpyrimidine oxygenase, with activity on both 5mC (major activity) and thymidine (T) (minor activity) in all DNA forms tested, and provide unprecedented evidence for the formation of 5-formyluridine (5fU) and 5-carboxyuridine (5caU) in vitro. Mutagenesis studies reveal a delicate balance between choice of 5mC or T as the preferred substrate. Furthermore, our results suggest substrate preference by NgTET1 to 5mCpG and TpG dinucleotide sites in DNA. Intriguingly, NgTET1 displays higher T-oxidation activity in vitro than mammalian TET1, supporting a closer evolutionary relationship between NgTET1 and the base J-binding proteins from trypanosomes. Finally, we demonstrate that NgTET1 can be readily used as a tool in 5mC sequencing technologies such as single molecule, real-time sequencing to map 5mC in bacterial genomes at base resolution.


July 7, 2019

Insights on the emergence of Mycobacterium tuberculosis from the analysis of Mycobacterium kansasii.

By phylogenetic analysis, Mycobacterium kansasii is closely related to Mycobacterium tuberculosis. Yet, although both organisms cause pulmonary disease, M. tuberculosis is a global health menace, whereas M. kansasii is an opportunistic pathogen. To illuminate the differences between these organisms, we have sequenced the genome of M. kansasii ATCC 12478 and its plasmid (pMK12478) and conducted side-by-side in vitro and in vivo investigations of these two organisms. The M. kansasii genome is 6,432,277 bp, more than 2 Mb longer than that of M. tuberculosis H37Rv, and the plasmid contains 144,951 bp. Pairwise comparisons reveal conserved and discordant genes and genomic regions. A notable example of genomic conservation is the virulence locus ESX-1, which is intact and functional in the low-virulence M. kansasii, potentially mediating phagosomal disruption. Differences between these organisms include a decreased predicted metabolic capacity, an increased proportion of toxin-antitoxin genes, and the acquisition of M. tuberculosis-specific genes in the pathogen since their common ancestor. Consistent with their distinct epidemiologic profiles, following infection of C57BL/6 mice, M. kansasii counts increased by less than 10-fold over 6 weeks, whereas M. tuberculosis counts increased by over 10,000-fold in just 3 weeks. Together, these data suggest that M. kansasii can serve as an image of the environmental ancestor of M. tuberculosis before its emergence as a professional pathogen, and can be used as a model organism to study the switch from an environmental opportunistic pathogen to a professional host-restricted pathogen. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


July 7, 2019

Complete genome sequence of Mycoplasma flocculare strain Ms42T (ATCC 27399T).

Mycoplasma flocculare is a commensal or low-virulence pathogen of swine. The complete 778,866-bp genome sequence of M. flocculare strain Ms42T has been determined, enabling further comparison to genomes of the closely related pathogen Mycoplasma hyopneumoniae. The absence of the p97 and glpD genes may contribute to the attenuated virulence of M. flocculare. Copyright © 2015 Calcutt et al.


July 7, 2019

Complete genome sequence of Pseudomonas aeruginosa mucoid strain FRD1, isolated from a cystic fibrosis patient.

We announce here the complete genome sequence of the Pseudomonas aeruginosa mucoid strain FRD1, isolated from the sputum of a cystic fibrosis patient. The complete genome of P. aeruginosa FRD1 is 6,712,339 bp. This genome will allow comparative genomics to be used to identify genes associated with virulence, especially those involved in chronic pulmonary infections. Copyright © 2015 Silo-Suh et al.


July 7, 2019

Phylogeographical analysis of the dominant multidrug-resistant H58 clade of Salmonella Typhi identifies inter- and intracontinental transmission events.

The emergence of multidrug-resistant (MDR) typhoid is a major global health threat affecting many countries where the disease is endemic. Here whole-genome sequence analysis of 1,832 Salmonella enterica serovar Typhi (S. Typhi) identifies a single dominant MDR lineage, H58, that has emerged and spread throughout Asia and Africa over the last 30 years. Our analysis identifies numerous transmissions of H58, including multiple transfers from Asia to Africa and an ongoing, unrecognized MDR epidemic within Africa itself. Notably, our analysis indicates that H58 lineages are displacing antibiotic-sensitive isolates, transforming the global population structure of this pathogen. H58 isolates can harbor a complex MDR element residing either on transmissible IncHI1 plasmids or within multiple chromosomal integration sites. We also identify new mutations that define the H58 lineage. This phylogeographical analysis provides a framework to facilitate global management of MDR typhoid and is applicable to similar MDR lineages emerging in other bacterial species.


July 7, 2019

Draft genome sequence of Erwinia tracheiphila, an economically important bacterial pathogen of cucurbits.

Erwinia tracheiphila is one of the most economically important pathogens of cucumbers, melons, squashes, pumpkins, and gourds in the northeastern and midwestern United States, yet its molecular pathology remains uninvestigated. Here, we report the first draft genome sequence of an E. tracheiphila strain isolated from an infected wild gourd (Cucurbita pepo subsp. texana) plant. The genome assembly consists of 7 contigs and includes a putative plasmid and at least 20 phage and prophage elements. Copyright © 2015 Shapiro et al.

