July 19, 2019

From short reads to chromosome-scale genome assemblies.

A high-quality, annotated genome assembly is the foundation for many downstream studies. However, obtaining such an assembly is a complex, reiterative process that requires the assimilation of high-quality data and combines different approaches and data types. While some software packages incorporating multiple steps of genome assembly are commercially available, they may not be flexible enough to be routinely applied to all organisms, particularly to nonmodel species such as pathogenic oomycetes and fungi. If researchers understand and apply the most appropriate, currently available tools for each step, it is possible to customize parameters and optimize results for their organism of study. Based on our experience of de novo assembly and annotation of several oomycete species, this chapter provides a modular workflow from processing of raw reads, to initial assembly generation, through optimization, chromosome-scale scaffolding and annotation, outlining input and output data as well as examples and alternative software used for each step. The accompanying Notes provide background information for each step as well as alternative options. The final result of this workflow could be an annotated, high-quality, validated, chromosome-scale assembly or a draft assembly of sufficient quality to meet specific needs of a project.


July 19, 2019

De novo assembly of haplotype-resolved genomes with trio binning.

Complex allelic variation hampers the assembly of haplotype-resolved sequences from diploid genomes. We developed trio binning, an approach that simplifies haplotype assembly by resolving allelic variation before assembly. In contrast with prior approaches, the effectiveness of our method improved with increasing heterozygosity. Trio binning uses short reads from two parental genomes to first partition long reads from an offspring into haplotype-specific sets. Each haplotype is then assembled independently, resulting in a complete diploid reconstruction. We used trio binning to recover both haplotypes of a diploid human genome and identified complex structural variants missed by alternative approaches. We sequenced an F1 cross between the cattle subspecies Bos taurus taurus and Bos taurus indicus and completely assembled both parental haplotypes with NG50 haplotig sizes of >20 Mb and 99.998% accuracy, surpassing the quality of current cattle reference genomes. We suggest that trio binning improves diploid genome assembly and will facilitate new studies of haplotype variation and inheritance.
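The partitioning step described above can be illustrated with a minimal sketch. It assumes that haplotype-specific signal is captured by k-mers found in only one parent's short reads. The abstract does not specify this mechanism, so treat the k-mer approach, the function names (`kmers`, `parent_specific_kmers`, `bin_read`) and the toy sequences as illustrative assumptions, not the authors' implementation. Reverse complements and sequencing errors are ignored for brevity.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(maternal_reads, paternal_reads, k):
    """Return (maternal-only, paternal-only) k-mer sets from parental short reads."""
    mat = set().union(*(kmers(r, k) for r in maternal_reads))
    pat = set().union(*(kmers(r, k) for r in paternal_reads))
    return mat - pat, pat - mat


def bin_read(read, mat_only, pat_only, k):
    """Assign an offspring long read to the parent whose private k-mers it shares more of."""
    ks = kmers(read, k)
    m = len(ks & mat_only)
    p = len(ks & pat_only)
    if m > p:
        return "maternal"
    if p > m:
        return "paternal"
    return "unassigned"  # no private k-mers, or a tie


# Toy data (hypothetical): the parents differ at a single base.
K = 4
mat_only, pat_only = parent_specific_kmers(["ACGTACGTAA"], ["ACGTTCGTAA"], K)
offspring_reads = ["GGACGTACGTAAGG", "TTACGTTCGTAATT", "GGGGGGGG"]
bins = [bin_read(r, mat_only, pat_only, K) for r in offspring_reads]
```

Each bin would then be assembled independently. Higher heterozygosity yields more parent-specific k-mers per read, which is consistent with the abstract's observation that the method improves as heterozygosity increases.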


July 19, 2019

Allele-defined genome of the autopolyploid sugarcane Saccharum spontaneum L.

Modern sugarcanes are polyploid interspecific hybrids, combining high sugar content from Saccharum officinarum with hardiness, disease resistance and ratooning of Saccharum spontaneum. Sequencing of a haploid S. spontaneum, AP85-441, facilitated the assembly of 32 pseudo-chromosomes comprising 8 homologous groups of 4 members each, bearing 35,525 genes with alleles defined. The reduction of basic chromosome number from 10 to 8 in S. spontaneum was caused by fissions of 2 ancestral chromosomes followed by translocations to 4 chromosomes. Surprisingly, 80% of nucleotide binding site-encoding genes associated with disease resistance are located in 4 rearranged chromosomes and 51% of those in rearranged regions. Resequencing of 64 S. spontaneum genomes identified balancing selection in rearranged regions, maintaining their diversity. Introgressed S. spontaneum chromosomes in modern sugarcanes are randomly distributed in the AP85-441 genome, indicating random recombination among homologs in different S. spontaneum accessions. The allele-defined Saccharum genome offers new knowledge and resources to accelerate sugarcane improvement.


July 19, 2019

Improved reference genome of Aedes aegypti informs arbovirus vector control.

Female Aedes aegypti mosquitoes infect more than 400 million people each year with dangerous viral pathogens including dengue, yellow fever, Zika and chikungunya. Progress in understanding the biology of mosquitoes and developing the tools to fight them has been slowed by the lack of a high-quality genome assembly. Here we combine diverse technologies to produce the markedly improved, fully re-annotated AaegL5 genome assembly, and demonstrate how it accelerates mosquito science. We anchored physical and cytogenetic maps, doubled the number of known chemosensory ionotropic receptors that guide mosquitoes to human hosts and egg-laying sites, provided further insight into the size and composition of the sex-determining M locus, and revealed copy-number variation among glutathione S-transferase genes that are important for insecticide resistance. Using high-resolution quantitative trait locus and population genomic analyses, we mapped new candidates for dengue vector competence and insecticide resistance. AaegL5 will catalyse new biological insights and intervention strategies to fight this deadly disease vector.


July 19, 2019

The Dominant and Poorly Penetrant Phenotypes of Maize Unstable factor for orange1 Are Caused by DNA Methylation Changes at a Linked Transposon.

The maize (Zea mays) mutant Unstable factor for orange1 (Ufo1) has been implicated in the epigenetic modifications of pericarp color1 (p1), which regulates the production of the flavonoid pigments phlobaphenes. Here, we show that the ufo1 gene maps to a genetically recalcitrant region near the centromere of chromosome 10. Transcriptome analysis of Ufo1-1 mutant and wild-type plants identified a candidate gene in the mapping region using a comparative sequence-based approach. The candidate gene, GRMZM2G053177, is overexpressed by >45-fold in multiple tissues of Ufo1-1, explaining the dominance of Ufo1-1 and its phenotypes. In the mutant stock, GRMZM2G053177 has a unique transcript originating within a CACTA transposon inserted in its first intron, and it is missing the first four codons of the wild-type transcript. GRMZM2G053177 expression is regulated by the DNA methylation status of the CACTA transposon, explaining the incomplete penetrance and poor expressivity of Ufo1-1. Transgenic overexpression lines of GRMZM2G053177 (Ufo1-1) phenocopy the p1-induced pigmentation in coleoptiles, tassels, leaf sheaths, husks, pericarps, and cob glumes. Transcriptome analysis of Ufo1 versus wild-type tissues revealed changes in several pathways related to abiotic and biotic stress. Thus, this study addresses the enigma of Ufo1 identity in maize, which had gone unsolved for more than 50 years. © 2018 American Society of Plant Biologists. All rights reserved.


July 7, 2019

Genome sequence of Serratia nematodiphila DSM 21420T, a symbiotic bacterium from entomopathogenic nematode.

Serratia nematodiphila DSM 21420(T) (=CGMCC 1.6853(T), DZ0503SBS1(T)), isolated from the intestine of Heterorhabditidoides chongmingensis, is known to have a symbiotic-pathogenic life cycle involving multilateral relationships with an entomopathogenic nematode and insect pests. To better understand this rare feature among Serratia species, we present here the genome sequence of S. nematodiphila DSM 21420(T), the first genome sequence for this species. Copyright © 2014 Elsevier B.V. All rights reserved.


July 7, 2019

Construction of a reference genetic map of Raphanus sativus based on genotyping by whole-genome resequencing.

This manuscript provides a genetic map of Raphanus sativus that has been used as a reference genetic map for an ongoing genome sequencing project. The map was constructed based on genotyping by whole-genome resequencing of the mapping parents and an F2 population. Raphanus sativus is an annual vegetable crop species of the Brassicaceae family and is one of the key plants in the seed industry, especially in East Asia. Assessment of the R. sativus genome provides fundamental resources for crop improvement as well as the study of crop genome structure and evolution. With the goal of anchoring genome sequence assemblies of R. sativus cv. WK10039, whose genome has been sequenced, onto the chromosomes, we developed a reference genetic map based on genotyping of two parents (maternal WK10039 and paternal WK10024) and 93 individuals of the F2 mapping population by whole-genome resequencing. To develop high-confidence genetic markers, ~83 Gb of parental line data and ~591 Gb of mapping population data were generated as Illumina 100 bp paired-end reads. High-stringency sequence analysis of the reads mapped to the 344 Mb of genome sequence scaffolds identified a total of 16,282 SNPs and 150 PCR-based markers. Using a subset of the markers, a high-density genetic map was constructed from the analysis of 2,637 markers spanning 1,538 cM with 1,000 unique framework loci. The genetic markers integrated 295 Mb of genome sequences to the cytogenetically defined chromosome arms. Comparative analysis of the chromosome-anchored sequences with Arabidopsis thaliana and Brassica rapa revealed that the R. sativus genome has evident triplicated sub-genome blocks and that the structure of its gene space is highly similar to that of B. rapa. The genetic map developed in this study will serve as a fundamental genomic resource for the study of R. sativus.
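The figures reported above imply a few summary statistics that are not stated in the abstract. The short sketch below derives them from the reported numbers only. The function name `map_summary` and its output keys are hypothetical, and the averages assume markers and loci are spread evenly along the map, which a real map will not satisfy exactly.

```python
def map_summary(markers, map_length_cm, framework_loci, scaffold_mb, anchored_mb):
    """Derive average spacing and anchoring fraction from reported map totals."""
    return {
        # Mean genetic distance per marker, assuming even spacing
        "cm_per_marker": map_length_cm / markers,
        # Mean genetic distance per unique framework locus
        "cm_per_framework_locus": map_length_cm / framework_loci,
        # Share of scaffold sequence placed on chromosome arms
        "anchored_fraction": anchored_mb / scaffold_mb,
    }


# Values taken from the abstract: 2,637 markers, 1,538 cM,
# 1,000 framework loci, 344 Mb of scaffolds, 295 Mb anchored.
summary = map_summary(2637, 1538, 1000, 344, 295)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the map averages roughly 0.58 cM per marker and about 1.54 cM per framework locus, and anchors close to 86% of the scaffold sequence.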


July 7, 2019

Molecular characterization of plasmid pMoma1 of Moraxella macacae, a newly described bacterial pathogen of macaques.

We report the complete nucleotide sequence and characterization of a small cryptic plasmid of Moraxella macacae 0408225, a newly described bacterial species within the family Moraxellaceae and a causative agent of epistaxis in macaques. The complete nucleotide sequence of the plasmid pMoma1 was determined and found to be 5,375 bp in size with a GC content of 37.4%. Computer analysis of the sequence data revealed five open reading frames encoding putative proteins of 54.4 kDa (ORF1), 17.6 kDa (ORF2), 13.3 kDa (ORF3), 51.6 kDa (ORF4), and 25.0 kDa (ORF5). ORF1, ORF2, and ORF3 encode putative proteins with high identity (72, 42, and 55%, respectively) to mobilization proteins of plasmids found in other Moraxella species. ORF3 encodes a putative protein with similarity (about 40%) to several plasmid replicase (RepA) proteins. The fifth open reading frame (ORF5) was most similar to hypothetical proteins with unknown functions, although domain analysis of this sequence suggests it belongs to the Abi-like protein family. Upstream of the repA gene, a 470-bp intergenic region was identified that contained an AT-rich section and two sets of tandem direct and indirect repeats, consistent with a putative origin of replication site. In contrast to other plasmids of Moraxella, the occurrence of pMoma1 in M. macacae isolates appears to be common, as PCR testing showed that all 14 clinical isolates from two different research institutions contained the plasmid.


July 7, 2019

Prevalence of subtilase cytotoxin-encoding subAB variants among Shiga toxin-producing Escherichia coli strains isolated from wild ruminants and sheep differs from that of cattle and pigs and is predominated by the new allelic variant subAB2-2.

Subtilase cytotoxin (SubAB) is an AB5 toxin produced by Shiga toxin (Stx)-producing Escherichia coli (STEC) strains usually lacking the eae gene product intimin. Three allelic variants of SubAB-encoding genes have been described: subAB1, located on a plasmid; subAB2-1, located on the pathogenicity island SE-PAI; and subAB2-2, located in an outer membrane efflux protein (OEP) region. SubAB is becoming increasingly recognized as a toxin potentially involved in human pathogenesis. Ruminants and cattle have been identified as reservoirs of subAB-positive STEC. The presence of the three subAB allelic variants was investigated by PCR for 152 STEC strains originating from chamois, ibex, red deer, roe deer, cattle, sheep and pigs. Overall, subAB genes were detected in 45.5% of the strains. Prevalence was highest for STEC originating from ibex (100%), chamois (92%) and sheep (65%). None of the STEC of bovine or of porcine origin tested positive for subAB. None of the strains tested positive for subAB1. The allelic variant subAB2-2 was detected most commonly, with 51.4% possessing subAB2-1 together with subAB2-2. STEC of ovine origin, serotypes O91:H- and O128:H2, the saa gene, which encodes the autoagglutinating adhesin, and stx2b were significantly associated with subAB-positive STEC. Our results suggest that subAB2-1 and subAB2-2 are widespread among STEC from wild ruminants and sheep and may be important as virulence markers in STEC pathogenic to humans. Copyright © 2014 Elsevier GmbH. All rights reserved.


July 7, 2019

Burkholderia pseudomallei sequencing identifies genomic clades with distinct recombination, accessory, and epigenetic profiles.

Burkholderia pseudomallei (Bp) is the causative agent of the infectious disease melioidosis. To investigate population diversity, recombination, and horizontal gene transfer in closely related Bp isolates, we performed whole-genome sequencing (WGS) on 106 clinical, animal, and environmental strains from a restricted Asian locale. Whole-genome phylogenies resolved multiple genomic clades of Bp, largely congruent with multilocus sequence typing (MLST). We discovered widespread recombination in the Bp core genome, involving hundreds of regions associated with multiple haplotypes. Highly recombinant regions exhibited functional enrichments that may contribute to virulence. We observed clade-specific patterns of recombination and accessory gene exchange, and provide evidence that this is likely due to ongoing recombination between clade members. Reciprocally, interclade exchanges were rarely observed, suggesting mechanisms restricting gene flow between clades. Interrogation of accessory elements revealed that each clade harbored a distinct complement of restriction-modification (RM) systems, predicted to cause clade-specific patterns of DNA methylation. Using methylome sequencing, we confirmed that representative strains from separate clades indeed exhibit distinct methylation profiles. Finally, using an E. coli system, we demonstrate that Bp RM systems can inhibit uptake of non-self DNA. Our data suggest that RM systems borne on mobile elements, besides preventing foreign DNA invasion, may also contribute to limiting exchanges of genetic material between individuals of the same species. Genomic clades may thus represent functional units of genetic isolation in Bp, modulating intraspecies genetic diversity. © 2015 Nandi et al.; Published by Cold Spring Harbor Laboratory Press.


July 7, 2019

Late Pleistocene Australian marsupial DNA clarifies the affinities of extinct megafaunal kangaroos and wallabies.

Understanding the evolution of Australia’s extinct marsupial megafauna has been hindered by a relatively incomplete fossil record and convergent or highly specialized morphology, which confound phylogenetic analyses. Further, the harsh Australian climate and early date of most megafaunal extinctions (39-52 ka) mean that the vast majority of fossil remains are unsuitable for ancient DNA analyses. Here, we apply cross-species DNA capture to fossils from relatively high latitude, high altitude caves in Tasmania. Using low-stringency hybridization and high-throughput sequencing, we were able to retrieve mitochondrial sequences from two extinct megafaunal macropodid species. The two specimens, Simosthenurus occidentalis (giant short-faced kangaroo) and Protemnodon anak (giant wallaby), have been radiocarbon dated to 46-50 and 40-45 ka, respectively. This is significantly older than any Australian fossil that has previously yielded DNA sequence information. Processing the raw sequence data from these samples posed a bioinformatic challenge due to the poor preservation of DNA. We explored several approaches in order to maximize the signal-to-noise ratio in retained sequencing reads. Our findings demonstrate the critical importance of adopting stringent processing criteria when distant outgroups are used as references for mapping highly fragmented DNA. Based on the most stringent nucleotide data sets (879 bp for S. occidentalis and 2,383 bp for P. anak), total-evidence phylogenetic analyses confirm that macropodids consist of three primary lineages: Sthenurines such as Simosthenurus (extinct short-faced kangaroos), the macropodines (all other wallabies and kangaroos), and the enigmatic living banded hare-wallaby Lagostrophus fasciatus (Lagostrophinae). Protemnodon emerges as a close relative of Macropus (large living kangaroos), a position not supported by recent morphological phylogenetic analyses. © The Authors 2014. Published by Oxford University Press on behalf of Molecular Biology and Evolution. All rights reserved. For Permissions, please email: journals.permissions@oup.com.


July 7, 2019

Finished genome sequence of Collimonas arenae Cal35.

We announce the finished genome sequence of the forest soil isolate Collimonas arenae Cal35, which comprises a 5.6-Mbp chromosome and a 41-kb plasmid. The Cal35 genome is the second one published for the bacterial genus Collimonas and represents the first opportunity for high-resolution comparison of genome content and synteny among collimonads. Copyright © 2015 Wu et al.


July 7, 2019

Nonribosomal peptide synthase gene clusters for lipopeptide biosynthesis in Bacillus subtilis 916 and their phenotypic functions.

Bacillus cyclic lipopeptides (LPs) have been well studied for their phytopathogen-antagonistic activities. Recently, research has shown that these LPs also contribute to the phenotypic features of Bacillus strains, such as hemolytic activity, swarming motility, biofilm formation, and colony morphology. Bacillus subtilis 916 not only coproduces the three families of well-known LPs, i.e., surfactins, bacillomycin Ls (iturin family), and fengycins, but also produces a new family of LPs called locillomycins. The genome of B. subtilis 916 contains four nonribosomal peptide synthase (NRPS) gene clusters, srf, bmy, fen, and loc, which are responsible for the biosynthesis of surfactins, bacillomycin Ls, fengycins, and locillomycins, respectively. By studying B. subtilis 916 mutants lacking production of one, two, or three LPs, we attempted to unveil the connections between LPs and phenotypic features. We demonstrated that bacillomycin Ls and fengycins contribute mainly to antifungal activity. Although surfactins have weak antifungal activity in vitro, the strain mutated in srfAA had significantly decreased antifungal activity. This may be due to the impaired production of fengycins and bacillomycin Ls. We also found that the disruption of any LP gene cluster other than fen resulted in a change in colony morphology. While surfactins and bacillomycin Ls play very important roles in hemolytic activity, swarming motility, and biofilm formation, the fengycins and locillomycins had little influence on these phenotypic features. In conclusion, B. subtilis 916 coproduces four families of LPs which contribute to the phenotypic features of B. subtilis 916 in an intricate way. Copyright © 2015, American Society for Microbiology. All Rights Reserved.


July 7, 2019

Strategies for optimizing algal biology for enhanced biomass production

One of the most environmentally sustainable ways to produce high-energy-density (oil) feedstocks for the production of liquid transportation fuels is from biomass. Photosynthetic carbon capture combined with biomass combustion (point source) and subsequent carbon capture and sequestration has also been proposed in the Intergovernmental Panel on Climate Change report as one of the most effective and economical strategies to remediate atmospheric greenhouse gases. To maximize photosynthetic carbon capture efficiency and energy-return-on-investment, we must develop biomass production systems that achieve the greatest yields with the lowest inputs. Numerous studies have demonstrated that microalgae have among the greatest potentials for biomass production. This is in part because all algal cells are photoautotrophic, they have active carbon-concentrating mechanisms to increase photosynthetic productivity, and all the biomass is harvestable, unlike in plants. All photosynthetic organisms, however, convert only a fraction of the solar energy they capture into chemical energy (reduced carbon or biomass). To increase areal carbon capture rates and biomass productivity, it will be necessary to identify the most robust algal strains and increase their biomass production efficiency, often by genetic manipulation. We review recent large-scale efforts to identify the best biomass-producing strains and metabolic engineering strategies to improve areal productivity. These strategies include optimization of photosynthetic light-harvesting antenna size to increase energy capture and conversion efficiency and the potential development of advanced molecular breeding techniques. To date, these strategies have resulted in up to twofold increases in biomass productivity.

