Menu
July 7, 2019  |  

An improved genome assembly uncovers prolific tandem repeats in Atlantic cod.

The first Atlantic cod (Gadus morhua) genome assembly published in 2011 was one of the early genome assemblies exclusively based on high-throughput 454 pyrosequencing. Since then, rapid advances in sequencing technologies have led to a multitude of assemblies generated for complex genomes, although many of these are of a fragmented nature with a significant fraction of bases in gaps. The development of long-read sequencing and improved software now enable the generation of more contiguous genome assemblies.By combining data from Illumina, 454 and the longer PacBio sequencing technologies, as well as integrating the results of multiple assembly programs, we have created a substantially improved version of the Atlantic cod genome assembly. The sequence contiguity of this assembly is increased fifty-fold and the proportion of gap-bases has been reduced fifteen-fold. Compared to other vertebrates, the assembly contains an unusual high density of tandem repeats (TRs). Indeed, retrospective analyses reveal that gaps in the first genome assembly were largely associated with these TRs. We show that 21% of the TRs across the assembly, 19% in the promoter regions and 12% in the coding sequences are heterozygous in the sequenced individual.The inclusion of PacBio reads combined with the use of multiple assembly programs drastically improved the Atlantic cod genome assembly by successfully resolving long TRs. The high frequency of heterozygous TRs within or in the vicinity of genes in the genome indicate a considerable standing genomic variation in Atlantic cod populations, which is likely of evolutionary importance.


July 7, 2019  |  

Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly.

The human reference genome assembly plays a central role in nearly all aspects of today’s basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009; it reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures, and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype resources to identify and resolve larger assembly issues. For the first time, the reference assembly contains sequence-based representations for the centromeres. We also expanded the number of alternate loci to create a reference that provides a more robust representation of human population variation. We demonstrate that the updates render the reference an improved annotation substrate, alter read alignments in unchanged regions, and impact variant interpretation at clinically relevant loci. We additionally evaluated a collection of new de novo long-read haploid assemblies and conclude that although the new assemblies compare favorably to the reference with respect to continuity, error rate, and gene completeness, the reference still provides the best representation for complex genomic regions and coding sequences. We assert that the collected updates in GRCh38 make the newer assembly a more robust substrate for comprehensive analyses that will promote our understanding of human biology and advance our efforts to improve health. © 2017 Schneider et al.; Published by Cold Spring Harbor Laboratory Press.


July 7, 2019  |  

Genome sequence of Plasmopara viticola and insight into the pathogenic mechanism.

Plasmopara viticola causes downy mildew disease of grapevine which is one of the most devastating diseases of viticulture worldwide. Here we report a 101.3?Mb whole genome sequence of P. viticola isolate ‘JL-7-2’ obtained by a combination of Illumina and PacBio sequencing technologies. The P. viticola genome contains 17,014 putative protein-coding genes and has ~26% repetitive sequences. A total of 1,301 putative secreted proteins, including 100 putative RXLR effectors and 90 CRN effectors were identified in this genome. In the secretome, 261 potential pathogenicity genes and 95 carbohydrate-active enzymes were predicted. Transcriptional analysis revealed that most of the RXLR effectors, pathogenicity genes and carbohydrate-active enzymes were significantly up-regulated during infection. Comparative genomic analysis revealed that P. viticola evolved independently from the Arabidopsis downy mildew pathogen Hyaloperonospora arabidopsidis. The availability of the P. viticola genome provides a valuable resource not only for comparative genomic analysis and evolutionary studies among oomycetes, but also enhance our knowledge on the mechanism of interactions between this biotrophic pathogen and its host.


July 7, 2019  |  

Extremely low genomic diversity of Rickettsia japonica distributed in Japan.

Rickettsiae are obligate intracellular bacteria that have small genomes as a result of reductive evolution. Many Rickettsia species of the spotted fever group (SFG) cause tick-borne diseases known as “spotted fevers”. The life cycle of SFG rickettsiae is closely associated with that of the tick, which is generally thought to act as a bacterial vector and reservoir that maintains the bacterium through transstadial and transovarial transmission. Each SFG member is thought to have adapted to a specific tick species, thus restricting the bacterial distribution to a relatively limited geographic region. These unique features of SFG rickettsiae allow investigation of how the genomes of such biologically and ecologically specialized bacteria evolve after genome reduction and the types of population structures that are generated. Here, we performed a nationwide, high-resolution phylogenetic analysis of Rickettsia japonica, an etiological agent of Japanese spotted fever that is distributed in Japan and Korea. The comparison of complete or nearly complete sequences obtained from 31 R. japonica strains isolated from various sources in Japan over the past 30 years demonstrated an extremely low level of genomic diversity. In particular, only 34 single nucleotide polymorphisms were identified among the 27 strains of the major lineage containing all clinical isolates and tick isolates from the three tick species. Our data provide novel insights into the biology and genome evolution of R. japonica, including the possibilities of recent clonal expansion and a long generation time in nature due to the long dormant phase associated with tick life cycles.© The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


July 7, 2019  |  

The Nephila clavipes genome highlights the diversity of spider silk genes and their complex expression.

Spider silks are the toughest known biological materials, yet are lightweight and virtually invisible to the human immune system, and they thus have revolutionary potential for medicine and industry. Spider silks are largely composed of spidroins, a unique family of structural proteins. To investigate spidroin genes systematically, we constructed the first genome of an orb-weaving spider: the golden orb-weaver (Nephila clavipes), which builds large webs using an extensive repertoire of silks with diverse physical properties. We cataloged 28 Nephila spidroins, representing all known orb-weaver spidroin types, and identified 394 repeated coding motif variants and higher-order repetitive cassette structures unique to specific spidroins. Characterization of spidroin expression in distinct silk gland types indicates that glands can express multiple spidroin types. We find evidence of an alternatively spliced spidroin, a spidroin expressed only in venom glands, evolutionary mechanisms for spidroin diversification, and non-spidroin genes with expression patterns that suggest roles in silk production.


July 7, 2019  |  

Genome sequencing and population genomic analyses provide insights into the adaptive landscape of silver birch.

Silver birch (Betula pendula) is a pioneer boreal tree that can be induced to flower within 1 year. Its rapid life cycle, small (440-Mb) genome, and advanced germplasm resources make birch an attractive model for forest biotechnology. We assembled and chromosomally anchored the nuclear genome of an inbred B. pendula individual. Gene duplicates from the paleohexaploid event were enriched for transcriptional regulation, whereas tandem duplicates were overrepresented by environmental responses. Population resequencing of 80 individuals showed effective population size crashes at major points of climatic upheaval. Selective sweeps were enriched among polyploid duplicates encoding key developmental and physiological triggering functions, suggesting that local adaptation has tuned the timing of and cross-talk between fundamental plant processes. Variation around the tightly-linked light response genes PHYC and FRS10 correlated with latitude and longitude and temperature, and with precipitation for PHYC. Similar associations characterized the growth-promoting cytokinin response regulator ARR1, and the wood development genes KAK and MED5A.


July 7, 2019  |  

High-quality draft genome sequences of four lignocellulose-degrading bacteria isolated from Puerto Rican forest soil: Gordonia sp., Paenibacillus sp., Variovorax sp., and Vogesella sp.

Here, we report the high-quality draft genome sequences of four phylogenetically diverse lignocellulose-degrading bacteria isolated from tropical soil (Gordonia sp., Paenibacillus sp., Variovorax sp., and Vogesella sp.) to elucidate the genetic basis of their ability to degrade lignocellulose. These isolates may provide novel enzymes for biofuel production. Copyright © 2017 Woo et al.


July 7, 2019  |  

No evidence for maintenance of a sympatric Heliconius species barrier by chromosomal inversions.

Mechanisms that suppress recombination are known to help maintain species barriers by preventing the breakup of coadapted gene combinations. The sympatric butterfly species Heliconius melpomene and Heliconius cydno are separated by many strong barriers, but the species still hybridize infrequently in the wild, and around 40% of the genome is influenced by introgression. We tested the hypothesis that genetic barriers between the species are maintained by inversions or other mechanisms that reduce between-species recombination rate. We constructed fine-scale recombination maps for Panamanian populations of both species and their hybrids to directly measure recombination rate within and between species, and generated long sequence reads to detect inversions. We find no evidence for a systematic reduction in recombination rates in F1 hybrids, and also no evidence for inversions longer than 50 kb that might be involved in generating or maintaining species barriers. This suggests that mechanisms leading to global or local reduction in recombination do not play a significant role in the maintenance of species barriers between H. melpomene and H. cydno.


July 7, 2019  |  

Characterization of four endophytic fungi as potential consolidated bioprocessing hosts for conversion of lignocellulose into advanced biofuels.

Recently, several endophytic fungi have been demonstrated to produce volatile organic compounds (VOCs) with properties similar to fossil fuels, called “mycodiesel,” while growing on lignocellulosic plant and agricultural residues. The fact that endophytes are plant symbionts suggests that some may be able to produce lignocellulolytic enzymes, making them capable of both deconstructing lignocellulose and converting it into mycodiesel, two properties that indicate that these strains may be useful consolidated bioprocessing (CBP) hosts for the biofuel production. In this study, four endophytes Hypoxylon sp. CI4A, Hypoxylon sp. EC38, Hypoxylon sp. CO27, and Daldinia eschscholzii EC12 were selected and evaluated for their CBP potential. Analysis of their genomes indicates that these endophytes have a rich reservoir of biomass-deconstructing carbohydrate-active enzymes (CAZys), which includes enzymes active on both polysaccharides and lignin, as well as terpene synthases (TPSs), enzymes that may produce fuel-like molecules, suggesting that they do indeed have CBP potential. GC-MS analyses of their VOCs when grown on four representative lignocellulosic feedstocks revealed that these endophytes produce a wide spectrum of hydrocarbons, the majority of which are monoterpenes and sesquiterpenes, including some known biofuel candidates. Analysis of their cellulase activity when grown under the same conditions revealed that these endophytes actively produce endoglucanases, exoglucanases, and ß-glucosidases. The richness of CAZymes as well as terpene synthases identified in these four endophytic fungi suggests that they are great candidates to pursue for development into platform CBP organisms.


July 7, 2019  |  

Coping with living in the soil: the genome of the parthenogenetic springtail Folsomia candida.

Folsomia candida is a model in soil biology, belonging to the family of Isotomidae, subclass Collembola. It reproduces parthenogenetically in the presence of Wolbachia, and exhibits remarkable physiological adaptations to stress. To better understand these features and adaptations to life in the soil, we studied its genome in the context of its parthenogenetic lifestyle.We applied Pacific Bioscience sequencing and assembly to generate a reference genome for F. candida of 221.7 Mbp, comprising only 162 scaffolds. The complete genome of its endosymbiont Wolbachia, was also assembled and turned out to be the largest strain identified so far. Substantial gene family expansions and lineage-specific gene clusters were linked to stress response. A large number of genes (809) were acquired by horizontal gene transfer. A substantial fraction of these genes are involved in lignocellulose degradation. Also, the presence of genes involved in antibiotic biosynthesis was confirmed. Intra-genomic rearrangements of collinear gene clusters were observed, of which 11 were organized as palindromes. The Hox gene cluster of F. candida showed major rearrangements compared to arthropod consensus cluster, resulting in a disorganized cluster.The expansion of stress response gene families suggests that stress defense was important to facilitate colonization of soils. The large number of HGT genes related to lignocellulose degradation could be beneficial to unlock carbohydrate sources in soil, especially those contained in decaying plant and fungal organic matter. Intra- as well as inter-scaffold duplications of gene clusters may be a consequence of its parthenogenetic lifestyle. This high quality genome will be instrumental for evolutionary biologists investigating deep phylogenetic lineages among arthropods and will provide the basis for a more mechanistic understanding in soil ecology and ecotoxicology.


July 7, 2019  |  

Repeated divergent selection on pigmentation genes in a rapid finch radiation.

Instances of recent and rapid speciation are suitable for associating phenotypes with their causal genotypes, especially if gene flow homogenizes areas of the genome that are not under divergent selection. We study a rapid radiation of nine sympatric bird species known as capuchino seedeaters, which are differentiated in sexually selected characters of male plumage and song. We sequenced the genomes of a phenotypically diverse set of species to search for differentiated genomic regions. Capuchinos show differences in a small proportion of their genomes, yet selection has acted independently on the same targets in different members of this radiation. Many divergent regions contain genes involved in the melanogenesis pathway, with the strongest signal originating from putative regulatory regions. Selection has acted on these same genomic regions in different lineages, likely shaping the evolution of cis-regulatory elements, which control how more conserved genes are expressed and thereby generate diversity in classically sexually selected traits.


July 7, 2019  |  

Whole genome sequencing predicts novel human disease models in rhesus macaques.

Rhesus macaques are an important pre-clinical model of human disease. To advance our understanding of genomic variation that may influence disease, we surveyed genome-wide variation in 21 rhesus macaques. We employed best-practice variant calling, validated with Mendelian inheritance. Next, we used alignment data from our cohort to detect genomic regions likely to produce inaccurate genotypes, potentially due to either gene duplication or structural variation between individuals. We generated a final dataset of >16 million high confidence variants, including 13 million in Chinese-origin rhesus macaques, an increasingly important disease model. We detected an average of 131 mutations predicted to severely alter protein coding per animal, and identified 45 such variants that coincide with known pathogenic human variants. These data suggest that expanded screening of existing breeding colonies will identify novel models of human disease, and that increased genomic characterization can help inform research studies in macaques. Copyright © 2017 Elsevier Inc. All rights reserved.


July 7, 2019  |  

Complete genome sequence of the nematicidal Bacillus thuringiensis MYBT18246.

Bacillus thuringiensis is a rod-shaped facultative anaerobic spore forming bacterium of the genus Bacillus . The defining feature of the species is the ability to produce parasporal crystal inclusion bodies, consisting of d-endotoxins, encoded by cry-genes. Here we present the complete annotated genome sequence of the nematicidal B. thuringiensis strain MYBT18246. The genome comprises one 5,867,749 bp chromosome and 11 plasmids which vary in size from 6330 bp to 150,790 bp. The chromosome contains 6092 protein-coding and 150 RNA genes, including 36 rRNA genes. The plasmids encode 997 proteins and 4 t-RNA’s. Analysis of the genome revealed a large number of mobile elements involved in genome plasticity including 11 plasmids and 16 chromosomal prophages. Three different nematicidal toxin genes were identified and classified according to the Cry toxin naming committee as cry13Aa2, cry13Ba1, and cry13Ab1. Strikingly, these genes are located on the chromosome in close proximity to three separate prophages. Moreover, four putative toxin genes of different toxin classes were identified on the plasmids p120510 (Vip-like toxin), p120416 (Cry-like toxin) and p109822 (two Bin-like toxins). A comparative genome analysis of B. thuringiensis MYBT18246 with three closely related B. thuringiensis strains enabled determination of the pan-genome of B. thuringiensis MYBT18246, revealing a large number of singletons, mostly represented by phage genes, morons and cryptic genes.


July 7, 2019  |  

High-quality genome sequence of the radioresistant bacterium Deinococcus ficus KS 0460.

The genetic platforms of Deinococcus species remain the only systems in which massive ionizing radiation (IR)-induced genome damage can be investigated in vivo at exposures commensurate with cellular survival. We report the whole genome sequence of the extremely IR-resistant rod-shaped bacterium Deinococcus ficus KS 0460 and its phenotypic characterization. Deinococcus ficus KS 0460 has been studied since 1987, first under the name Deinobacter grandis, then Deinococcus grandis. The D. ficus KS 0460 genome consists of a 4.019 Mbp sequence (69.7% GC content and 3894 predicted genes) divided into six genome partitions, five of which are confirmed to be circular. Circularity was determined manually by mate pair linkage. Approximately 76% of the predicted proteins contained identifiable Pfam domains and 72% were assigned to COGs. Of all D. ficus KS 0460 proteins, 79% and 70% had homologues in Deinococcus radiodurans ATCC BAA-816 and Deinococcus geothermalis DSM 11300, respectively. The most striking differences between D. ficus KS 0460 and D. radiodurans BAA-816 identified by the comparison of the KEGG pathways were as follows: (i) D. ficus lacks nine enzymes of purine degradation present in D. radiodurans, and (ii) D. ficus contains eight enzymes involved in nitrogen metabolism, including nitrate and nitrite reductases, that D. radiodurans lacks. Moreover, genes previously considered to be important to IR resistance are missing in D. ficus KS 0460, namely, for the Mn-transporter nramp, and proteins DdrF, DdrJ and DdrK, all of which are also missing in Deinococcus deserti. Otherwise, D. ficus KS 0460 exemplifies the Deinococcus lineage.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.