Menu
September 22, 2019

Bacterial artificial chromosome clones randomly selected for sequencing reveal genomic differences between soybean cultivars

This study pioneered the use of multiple technologies to combine the bacterial artificial chromosome (BAC) pooling strategy with high-throughput next- and third-generation sequencing technologies to analyse genomic difference. To understand the genetic background of the Chinese soybean cultivar N23601, we built a BAC library and sequenced 10 randomly selected clones followed by de novo assembly. Comparative analysis was conducted against the reference genome of Glycine max var. Williams 82 (2.0). Therefore, our result is an assessment of the reference genome. Our results revealed that 3517 single nucleotide polymorphisms (SNPs) and 662 insertion–deletions (InDels) occurred in ~1.2 Mb of the genomic region and that four of the 10 BAC clones contained 15 large structural variations (72?887?bp) compared with the reference genome. Gene annotation of the reference genome showed that Glyma.18g181000 was missing from the corresponding position of the 10 BAC clones. Additionally, there may be a problem with the assembly of some positions of the reference genome. Several gap regions in the reference genome could be supplemented by using the complete sequence of the 10 BAC clones. We believe that accurate and complete BAC sequence is a valuable resource that contributes to the completeness of the reference genome.


September 22, 2019

Analysis of the Aedes albopictus C6/36 genome provides insight into cell line utility for viral propagation.

The 50-year-old Aedes albopictus C6/36 cell line is a resource for the detection, amplification, and analysis of mosquito-borne viruses including Zika, dengue, and chikungunya. The cell line is derived from an unknown number of larvae from an unspecified strain of Aedes albopictus mosquitoes. Toward improved utility of the cell line for research in virus transmission, we present an annotated assembly of the C6/36 genome.The C6/36 genome assembly has the largest contig N50 (3.3 Mbp) of any mosquito assembly, presents the sequences of both haplotypes for most of the diploid genome, reveals independent null mutations in both alleles of the Dicer locus, and indicates a male-specific genome. Gene annotation was computed with publicly available mosquito transcript sequences. Gene expression data from cell line RNA sequence identified enrichment of growth-related pathways and conspicuous deficiency in aquaporins and inward rectifier K+ channels. As a test of utility, RNA sequence data from Zika-infected cells were mapped to the C6/36 genome and transcriptome assemblies. Host subtraction reduced the data set by 89%, enabling faster characterization of nonhost reads.The C6/36 genome sequence and annotation should enable additional uses of the cell line to study arbovirus vector interactions and interventions aimed at restricting the spread of human disease.


September 22, 2019

Repeat-driven generation of antigenic diversity in a major human pathogen, Trypanosoma cruzi

Trypanosoma cruzi, a zoonotic kinetoplastid protozoan with a complex genome, is the causative agent of American trypanosomiasis (Chagas disease). The parasite uses a highly diverse repertoire of surface molecules, with roles in cell invasion, immune evasion and pathogenesis. Thus far, the genomic regions containing these genes have been impossible to resolve and it has been impossible to study the structure and function of the several thousand repetitive genes encoding the surface molecules of the parasite. We here present an improved genome assembly of a T. cruzi clade I (TcI) strain using high coverage PacBio single molecule sequencing, together with Illumina sequencing of 34 T. cruzi TcI isolates and clones from different geographic locations, sample sources and clinical outcomes. Resolution of the surface molecule gene structure reveals an unusual duality in the organisation of the parasite genome, a core genomic region syntenous with related protozoa flanked by unique and highly plastic subtelomeric regions encoding surface antigens. The presence of abundant interspersed retrotransposons in the subtelomeres suggests that these elements are involved in a recombination mechanism for the generation of antigenic variation and evasion of the host immune response. The comparative genomic analysis of the cohort of TcI strains revealed multiple cases of such recombination events involving surface molecule genes and has provided new insights into T. cruzi population structure.


September 22, 2019

Comparative genomics of the wheat fungal pathogen Pyrenophora tritici-repentis reveals chromosomal variations and genome plasticity.

Pyrenophora tritici-repentis (Ptr) is a necrotrophic fungal pathogen that causes the major wheat disease, tan spot. We set out to provide essential genomics-based resources in order to better understand the pathogenicity mechanisms of this important pathogen.Here, we present eight new Ptr isolate genomes, assembled and annotated; representing races 1, 2 and 5, and a new race. We report a high quality Ptr reference genome, sequenced by PacBio technology with Illumina paired-end data support and optical mapping. An estimated 98% of the genome coverage was mapped to 10 chromosomal groups, using a two-enzyme hybrid approach. The final reference genome was 40.9 Mb and contained a total of 13,797 annotated genes, supported by transcriptomic and proteogenomics data sets.Whole genome comparative analysis revealed major chromosomal segmental rearrangements and fusions, highlighting intraspecific genome plasticity in this species. Furthermore, the Ptr race classification was not supported at the whole genome level, as phylogenetic analysis did not cluster the ToxA producing isolates. This expansion of available Ptr genomics resources will directly facilitate research aimed at controlling tan spot disease.


September 22, 2019

Whole genome sequence of an edible and potential medicinal fungus, Cordyceps guangdongensis.

Cordyceps guangdongensis is an edible fungus which was approved as a novel food by the Chinese Ministry of Public Health in 2013. It also has a broad prospect of application in pharmaceutical industries, with many medicinal activities. In this study, the whole genome of C. guangdongensis GD15, a single spore isolate from a wild strain, was sequenced and assembled with Illumina and PacBio sequencing technology. The generated genome is 29.05 Mb in size, comprising nine scaffolds with an average GC content of 57.01%. It is predicted to contain a total of 9150 protein-coding genes. Sequence identification and comparative analysis indicated that the assembled scaffolds contained two complete chromosomes and four single-end chromosomes, showing a high level assembly. Gene annotation revealed a diversity of transposons that could contribute to the genome size and evolution. Besides, approximately 15.57% and 12.01% genes involved in metabolic processes were annotated by KEGG and COG respectively. Genes belonging to CAZymes accounted for 3.15% of the total genes. In addition, 435 transcription factors, involved in various biological processes, were identified. Among the identified transcription factors, the fungal transcription regulatory proteins (18.39%) and fungal-specific transcription factors (19.77%) represented the two largest classes of transcription factors. This genomic resource provided a new insight into better understanding the relevance of phenotypic characters and genetic mechanisms in C. guangdongensis. Copyright © 2018 Zhang et al.


September 22, 2019

Insights into platypus population structure and history from whole-genome sequencing.

The platypus is an egg-laying mammal which, alongside the echidna, occupies a unique place in the mammalian phylogenetic tree. Despite widespread interest in its unusual biology, little is known about its population structure or recent evolutionary history. To provide new insights into the dispersal and demographic history of this iconic species, we sequenced the genomes of 57 platypuses from across the whole species range in eastern mainland Australia and Tasmania. Using a highly improved reference genome, we called over 6.7?M SNPs, providing an informative genetic data set for population analyses. Our results show very strong population structure in the platypus, with our sampling locations corresponding to discrete groupings between which there is no evidence for recent gene flow. Genome-wide data allowed us to establish that 28 of the 57 sampled individuals had at least a third-degree relative among other samples from the same river, often taken at different times. Taking advantage of a sampled family quartet, we estimated the de novo mutation rate in the platypus at 7.0?×?10-9/bp/generation (95% CI 4.1?×?10-9-1.2?×?10-8/bp/generation). We estimated effective population sizes of ancestral populations and haplotype sharing between current groupings, and found evidence for bottlenecks and long-term population decline in multiple regions, and early divergence between populations in different regions. This study demonstrates the power of whole-genome sequencing for studying natural populations of an evolutionarily important species.


September 22, 2019

Double insertion of transposable elements provides a substrate for the evolution of satellite DNA.

Eukaryotic genomes are replete with repeated sequences in the form of transposable elements (TEs) dispersed across the genome or as satellite arrays, large stretches of tandemly repeated sequences. Many satellites clearly originated as TEs, but it is unclear how mobile genetic parasites can transform into megabase-sized tandem arrays. Comprehensive population genomic sampling is needed to determine the frequency and generative mechanisms of tandem TEs, at all stages from their initial formation to their subsequent expansion and maintenance as satellites. The best available population resources, short-read DNA sequences, are often considered to be of limited utility for analyzing repetitive DNA due to the challenge of mapping individual repeats to unique genomic locations. Here we develop a new pipeline called ConTExt that demonstrates that paired-end Illumina data can be successfully leveraged to identify a wide range of structural variation within repetitive sequence, including tandem elements. By analyzing 85 genomes from five populations of Drosophila melanogaster, we discover that TEs commonly form tandem dimers. Our results further suggest that insertion site preference is the major mechanism by which dimers arise and that, consequently, dimers form rapidly during periods of active transposition. This abundance of TE dimers has the potential to provide source material for future expansion into satellite arrays, and we discover one such copy number expansion of the DNA transposon hobo to approximately 16 tandem copies in a single line. The very process that defines TEs-transposition-thus regularly generates sequences from which new satellites can arise.© 2018 McGurk and Barbash; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019

Inpactor, integrated and parallel analyzer and classifier of LTR retrotransposons and its application for pineapple LTR retrotransposons diversity and dynamics.

One particular class of Transposable Elements (TEs), called Long Terminal Repeats (LTRs), retrotransposons, comprises the most abundant mobile elements in plant genomes. Their copy number can vary from several hundreds to up to a few million copies per genome, deeply affecting genome organization and function. The detailed classification of LTR retrotransposons is an essential step to precisely understand their effect at the genome level, but remains challenging in large-sized genomes, requiring the use of optimized bioinformatics tools that can take advantage of supercomputers. Here, we propose a new tool: Inpactor, a parallel and scalable pipeline designed to classify LTR retrotransposons, to identify autonomous and non-autonomous elements, to perform RT-based phylogenetic trees and to analyze their insertion times using High Performance Computing (HPC) techniques. Inpactor was tested on the classification and annotation of LTR retrotransposons in pineapple, a recently-sequenced genome. The pineapple genome assembly comprises 44% of transposable elements, of which 23% were classified as LTR retrotransposons. Exceptionally, 16.4% of the pineapple genome assembly corresponded to only one lineage of the Gypsy superfamily: Del, suggesting that this particular lineage has undergone a significant increase in its copy numbers. As demonstrated for the pineapple genome, Inpactor provides comprehensive data of LTR retrotransposons’ classification and dynamics, allowing a fine understanding of their contribution to genome structure and evolution. Inpactor is available at https://github.com/simonorozcoarias/Inpactor.


September 22, 2019

A transposable element annotation pipeline and expression analysis reveal potentially active elements in the microalga Tisochrysis lutea.

Transposable elements (TEs) are mobile DNA sequences known as drivers of genome evolution. Their impacts have been widely studied in animals, plants and insects, but little is known about them in microalgae. In a previous study, we compared the genetic polymorphisms between strains of the haptophyte microalga Tisochrysis lutea and suggested the involvement of active autonomous TEs in their genome evolution.To identify potentially autonomous TEs, we designed a pipeline named PiRATE (Pipeline to Retrieve and Annotate Transposable Elements, download: https://doi.org/10.17882/51795 ), and conducted an accurate TE annotation on a new genome assembly of T. lutea. PiRATE is composed of detection, classification and annotation steps. Its detection step combines multiple, existing analysis packages representing all major approaches for TE detection and its classification step was optimized for microalgal genomes. The efficiency of the detection and classification steps was evaluated with data on the model species Arabidopsis thaliana. PiRATE detected 81% of the TE families of A. thaliana and correctly classified 75% of them. We applied PiRATE to T. lutea genomic data and established that its genome contains 15.89% Class I and 4.95% Class II TEs. In these, 3.79 and 17.05% correspond to potentially autonomous and non-autonomous TEs, respectively. Annotation data was combined with transcriptomic and proteomic data to identify potentially active autonomous TEs. We identified 17 expressed TE families and, among these, a TIR/Mariner and a TIR/hAT family were able to synthesize their transposase. Both these TE families were among the three highest expressed genes in a previous transcriptomic study and are composed of highly similar copies throughout the genome of T. lutea. This sum of evidence reveals that both these TE families could be capable of transposing or triggering the transposition of potential related MITE elements.This manuscript provides an example of a de novo transposable element annotation of a non-model organism characterized by a fragmented genome assembly and belonging to a poorly studied phylum at genomic level. Integration of multi-omics data enabled the discovery of potential mobile TEs and opens the way for new discoveries on the role of these repeated elements in genomic evolution of microalgae.


September 22, 2019

Signatures of host specialization and a recent transposable element burst in the dynamic one-speed genome of the fungal barley powdery mildew pathogen.

Powdery mildews are biotrophic pathogenic fungi infecting a number of economically important plants. The grass powdery mildew, Blumeria graminis, has become a model organism to study host specialization of obligate biotrophic fungal pathogens. We resolved the large-scale genomic architecture of B. graminis forma specialis hordei (Bgh) to explore the potential influence of its genome organization on the co-evolutionary process with its host plant, barley (Hordeum vulgare).The near-chromosome level assemblies of the Bgh reference isolate DH14 and one of the most diversified isolates, RACE1, enabled a comparative analysis of these haploid genomes, which are highly enriched with transposable elements (TEs). We found largely retained genome synteny and gene repertoires, yet detected copy number variation (CNV) of secretion signal peptide-containing protein-coding genes (SPs) and locally disrupted synteny blocks. Genes coding for sequence-related SPs are often locally clustered, but neither the SPs nor the TEs reside preferentially in genomic regions with unique features. Extended comparative analysis with different host-specific B. graminis formae speciales revealed the existence of a core suite of SPs, but also isolate-specific SP sets as well as congruence of SP CNV and phylogenetic relationship. We further detected evidence for a recent, lineage-specific expansion of TEs in the Bgh genome.The characteristics of the Bgh genome (largely retained synteny, CNV of SP genes, recently proliferated TEs and a lack of significant compartmentalization) are consistent with a “one-speed” genome that differs in its architecture and (co-)evolutionary pattern from the “two-speed” genomes reported for several other filamentous phytopathogens.


September 22, 2019

Identification of a leucine-rich repeat receptor-like serine/threonine-protein kinase as a candidate gene for Rvi12 (Vb)-based apple scab resistance

Apple scab caused by Venturia inaequalis is the most important fungal disease of apples (Malus × domestica). Currently, the disease is controlled by up to 15 fungicide applications to the crop per year. Resistant apple cultivars will help promote the sustainable control of scab in commercial orchards. The breakdown of the Rvi6 (Vf) major-gene based resistance, the most used resistance gene in apple breeding, prompted the identification and characterization of new scab resistance genes. By using a large segregating population, the Rvi12 scab resistance gene was previously mapped to a genetic location flanked by molecular markers SNP_23.599 and SNP_24.482. Starting from these markers, utilizing chromosome walking of a Hansen’s baccata #2 (HB2) BAC-library; a single BAC clone spanning the Rvi12 interval was identified. Following Pacific Biosciences (PacBio) RS II sequencing and the use of the hierarchical genome assembly process (HGAP) assembly of the BAC clone sequence, the Rvi12 resistance locus was localized to a 62.3-kb genomic region. Gene prediction and in silico characterization identified a single candidate resistance gene. The gene, named here as Rvi12_Cd5, belongs to the LRR receptor-like serine/threonine-protein kinase family. In silico comparison of the resistance allele from HB2 and the susceptible allele from Golden Delicious (GD) identified the presence of an additional intron in the HB2 allele. Conserved domain analysis identified the presence of four additional LRR motifs in the susceptible allele compared to the resistance allele. The constitutive expression of Rvi12_Cd5 in HB2, together with its structural similarity to known resistance genes, makes it the most likely candidate for Rvi12 scab resistance in apple.


September 22, 2019

Amplification and adaptation of centromeric repeats in polyploid switchgrass species.

Centromeres in most higher eukaryotes are composed of long arrays of satellite repeats from a single satellite repeat family. Why centromeres are dominated by a single satellite repeat and how the satellite repeats originate and evolve are among the most intriguing and long-standing questions in centromere biology. We identified eight satellite repeats in the centromeres of tetraploid switchgrass (Panicum virgatum). Seven repeats showed characteristics associated with classical centromeric repeats with monomeric lengths ranging from 166 to 187 bp. Interestingly, these repeats share an 80-bp DNA motif. We demonstrate that this 80-bp motif may dictate translational and rotational phasing of the centromeric repeats with the cenH3 nucleosomes. The sequence of the last centromeric repeat, Pv156, is identical to the 5S ribosomal RNA genes. We demonstrate that a 5S ribosomal RNA gene array was recruited to be the functional centromere for one of the switchgrass chromosomes. Our findings reveal that certain types of satellite repeats, which are associated with unique sequence features and are composed of monomers in mono-nucleosomal length, are favorable for centromeres. Centromeric repeats may undergo dynamic amplification and adaptation before the centromeres in the same species become dominated by the best adapted satellite repeat.© 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.


September 22, 2019

Extensive exchange of transposable elements in the Drosophila pseudoobscura group.

As species diverge, so does their transposable element (TE) content. Within a genome, TE families may eventually become dormant due to host-silencing mechanisms, natural selection and the accumulation of inactive copies. The transmission of active copies from a TE families, both vertically and horizontally between species, can allow TEs to escape inactivation if it occurs often enough, as it may allow TEs to temporarily escape silencing in a new host. Thus, the contribution of horizontal exchange to TE persistence has been of increasing interest.Here, we annotated TEs in five species with sequenced genomes from the D. pseudoobscura species group, and curated a set of TE families found in these species. We found that, compared to host genes, many TE families showed lower neutral divergence between species, consistent with recent transmission of TEs between species. Despite these transfers, there are differences in the TE content between species in the group.The TE content is highly dynamic in the D. pseudoobscura species group, frequently transferring between species, keeping TEs active. This result highlights how frequently transposable elements are transmitted between sympatric species and, despite these transfers, how rapidly species TE content can diverge.


September 22, 2019

Diversity and evolution of the emerging Pandoraviridae family.

With DNA genomes reaching 2.5?Mb packed in particles of bacterium-like shape and dimension, the first two Acanthamoeba-infecting pandoraviruses remained up to now the most complex viruses since their discovery in 2013. Our isolation of three new strains from distant locations and environments is now used to perform the first comparative genomics analysis of the emerging worldwide-distributed Pandoraviridae family. Thorough annotation of the genomes combining transcriptomic, proteomic, and bioinformatic analyses reveals many non-coding transcripts and significantly reduces the former set of predicted protein-coding genes. Here we show that the pandoraviruses exhibit an open pan-genome, the enormous size of which is not adequately explained by gene duplications or horizontal transfers. As most of the strain-specific genes have no extant homolog and exhibit statistical features comparable to intergenic regions, we suggest that de novo gene creation could contribute to the evolution of the giant pandoravirus genomes.


September 22, 2019

Footprints of parasitism in the genome of the parasitic flowering plant Cuscuta campestris.

A parasitic lifestyle, where plants procure some or all of their nutrients from other living plants, has evolved independently in many dicotyledonous plant families and is a major threat for agriculture globally. Nevertheless, no genome sequence of a parasitic plant has been reported to date. Here we describe the genome sequence of the parasitic field dodder, Cuscuta campestris. The genome contains signatures of a fairly recent whole-genome duplication and lacks genes for pathways superfluous to a parasitic lifestyle. Specifically, genes needed for high photosynthetic activity are lost, explaining the low photosynthesis rates displayed by the parasite. Moreover, several genes involved in nutrient uptake processes from the soil are lost. On the other hand, evidence for horizontal gene transfer by way of genomic DNA integration from the parasite’s hosts is found. We conclude that the parasitic lifestyle has left characteristic footprints in the C. campestris genome.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.