pacbio data Archives - Page 13 of 21

July 7, 2019

Lost in plasmids: next generation sequencing and the complex genome of the tick-borne pathogen Borrelia burgdorferi.

Borrelia (B.) burgdorferi sensu lato, including the tick-transmitted agents of human Lyme borreliosis, have particularly complex genomes, consisting of a linear main chromosome and numerous linear and circular plasmids. The number and structure of plasmids is variable even in strains within a single genospecies. Genes on these plasmids are known to play essential roles in virulence and pathogenicity as well as host and vector associations. For this reason, it is essential to explore methods for rapid and reliable characterisation of molecular level changes on plasmids. In this study we used three strains: a low passage isolate of B. burgdorferi sensu stricto strain B31(-NRZ) and two closely related strains (PAli and PAbe) that were isolated from human patients. Sequences of these strains were compared to the previously sequenced reference strain B31 (available in GenBank) to obtain proof-of-principle information on the suitability of next generation sequencing (NGS) library construction and sequencing methods on the assembly of bacterial plasmids. We tested the effectiveness of different short read assemblers on Illumina sequences, and of long read generation methods on sequence data from Pacific Biosciences single-molecule real-time (SMRT) and nanopore (Oxford Nanopore Technologies) sequencing technology. Inclusion of mate pair library reads improved the assembly in some plasmids, as did prior enrichment of plasmids. While cp32 plasmids remained refractory to assembly using only short reads, they were effectively assembled by long read sequencing methods. The long read SMRT and nanopore sequences came, however, at the cost of indels (insertions or deletions) appearing in an unpredictable manner. Using long and short read technologies together allowed us to show that the three B. burgdorferi s.s. strains investigated here, whilst having similar plasmid structures to each other (apart from fusion of cp32 plasmids), differed significantly from the reference strain B31-GB, especially in the case of cp32 plasmids. Short read methods are sufficient to assemble the main chromosome and many of the plasmids in B. burgdorferi. However, a combination of short and long read sequencing methods is essential for proper assembly of all plasmids including cp32 and thus, for gaining an understanding of host- or vector adaptations. An important conclusion from our work is that the evolution of Borrelia plasmids appears to be dynamic. This has important implications for the development of useful research strategies to monitor the risk of Lyme disease occurrence and how to medically manage it.

July 7, 2019

Toolkit for automated and rapid discovery of structural variants.

Structural variations (SV) are broadly defined as genomic alterations that affect > 50 bp of DNA, and have been shown to have a significant effect on evolution and disease. The advent of high throughput sequencing (HTS) technologies and the ability to perform whole genome sequencing (WGS) make it feasible to study these variants in depth. However, discovery of all forms of SV using WGS has proven to be challenging, as the short reads produced by the predominant HTS platforms (<200 bp for current technologies) and the fact that most genomes include large amounts of repeats make it very difficult to unambiguously map and accurately characterize such variants. Furthermore, existing tools for SV discovery are primarily developed for only a few of the SV types, which may have conflicting sequence signatures (i.e. read pairs, read depth, split reads) with other, untargeted SV classes. Here we introduce a new framework, Tardis, which combines multiple read signatures into a single package to characterize most SV types simultaneously, while preventing such conflicts. Tardis also has a modular structure that makes it easy to extend for the discovery of additional forms of SV. Copyright © 2017. Published by Elsevier Inc.
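As a rough illustration of the size-based definition in the abstract above, the following Python sketch partitions candidate variant calls into small variants and structural variants using the > 50 bp threshold. This is not Tardis's algorithm, which the abstract does not describe in detail; the `Variant` record, its field names and the toy calls are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

SV_MIN_BP = 50  # from the definition above: SVs affect more than 50 bp


@dataclass
class Variant:
    """Hypothetical variant record; field names are illustrative only."""
    chrom: str
    start: int  # 0-based start of the affected interval
    end: int    # exclusive end of the affected interval
    kind: str   # e.g. "DEL", "DUP", "INV"

    @property
    def span(self) -> int:
        """Number of reference bases affected by the variant."""
        return self.end - self.start


def is_structural(v: Variant) -> bool:
    """True when the variant affects more than SV_MIN_BP bases."""
    return v.span > SV_MIN_BP


def split_by_size(variants):
    """Partition variants into (small, structural) lists."""
    small = [v for v in variants if not is_structural(v)]
    structural = [v for v in variants if is_structural(v)]
    return small, structural


# Toy calls, invented for this example.
calls = [
    Variant("chr1", 1000, 1012, "DEL"),  # 12 bp
    Variant("chr1", 5000, 5050, "DEL"),  # exactly 50 bp, so not > 50
    Variant("chr2", 200, 4200, "INV"),   # 4,000 bp
]
small, structural = split_by_size(calls)
```

The strict inequality follows the abstract's wording; other sources may draw the boundary differently.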

July 7, 2019

Analysis of complete genome sequence and major surface antigens of Neorickettsia helminthoeca, causative agent of salmon poisoning disease.

Neorickettsia helminthoeca, a type species of the genus Neorickettsia, is an endosymbiont of digenetic trematodes of veterinary importance. Upon ingestion of salmonid fish parasitized with infected trematodes, canids develop salmon poisoning disease (SPD), an acute febrile illness that is particularly severe and often fatal in dogs without adequate treatment. We determined and analysed the complete genome sequence of N. helminthoeca: a single small circular chromosome of 884 232 bp encoding 774 potential proteins. N. helminthoeca is unable to synthesize lipopolysaccharides and most amino acids, but is capable of synthesizing vitamins, cofactors, nucleotides and bacterioferritin. N. helminthoeca is, however, distinct from the majority of the family Anaplasmataceae to which it belongs, as it encodes nearly all enzymes required for peptidoglycan biosynthesis, suggesting its structural hardiness and inflammatory potential. Using sera from dogs that were experimentally infected by feeding with parasitized fish or naturally infected in southern California, Western blot analysis revealed that among five predicted N. helminthoeca outer membrane proteins, P51 and strain-variable surface antigen were uniformly recognized. Our findings will aid understanding of the pathogenesis and prevalence of N. helminthoeca infection among trematodes, canids and potentially other animals in nature, in order to develop effective SPD diagnostic and preventive measures. Recent progress in large-scale genome sequencing has been uncovering a broad distribution of Neorickettsia spp.; comparative genomics will facilitate understanding of the biology and natural history of these elusive environmental bacteria. © 2017 The Authors. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.

July 7, 2019

Genetic analysis of Neisseria meningitidis sequence type 7 serogroup X originating from serogroup A.

Neisseria meningitidis causes meningococcal disease, often resulting in fulminant meningitis, sepsis, and death. Vaccination programs have been developed to prevent infection by this pathogen, but serogroup replacement is a problem. Capsular switching has been an important survival mechanism for N. meningitidis, allowing the organism to evolve in the present vaccine era. However, related mechanisms have not been completely elucidated. Genetic analysis of capsular switching between diverse serogroups would help further our understanding of this pathogen. In this study, we analyzed the genetic characteristics of the sequence type 7 (ST-7) serogroup X strain that was predicted to arise from ST-7 serogroup A at the genomic level. Comparison of genomic structures and sequences showed that ST-7 serogroup X was closest to ST-7 serogroup A, although eight probable recombination regions, including the capsular gene locus, were identified. This indicated that serogroup X originated from serogroup A by recombination leading to capsular switching. The recombination involved approximately 8,540 bp from the end of the ctrC gene to the middle of the galE gene. There were more recombination regions and strain-specific single-nucleotide polymorphisms in serogroup X than in serogroup A genomes. However, no specific gene was found for each serogroup except those in the capsule gene locus. Copyright © 2017 American Society for Microbiology.

July 7, 2019

No evidence for maintenance of a sympatric Heliconius species barrier by chromosomal inversions.

Mechanisms that suppress recombination are known to help maintain species barriers by preventing the breakup of coadapted gene combinations. The sympatric butterfly species Heliconius melpomene and Heliconius cydno are separated by many strong barriers, but the species still hybridize infrequently in the wild, and around 40% of the genome is influenced by introgression. We tested the hypothesis that genetic barriers between the species are maintained by inversions or other mechanisms that reduce between-species recombination rate. We constructed fine-scale recombination maps for Panamanian populations of both species and their hybrids to directly measure recombination rate within and between species, and generated long sequence reads to detect inversions. We find no evidence for a systematic reduction in recombination rates in F1 hybrids, and also no evidence for inversions longer than 50 kb that might be involved in generating or maintaining species barriers. This suggests that mechanisms leading to global or local reduction in recombination do not play a significant role in the maintenance of species barriers between H. melpomene and H. cydno.

July 7, 2019

De novo yeast genome assemblies from MinION, PacBio and MiSeq platforms.

Long-read sequencing technologies such as Pacific Biosciences and Oxford Nanopore MinION are capable of producing long sequencing reads with average fragment lengths of over 10,000 base-pairs and maximum lengths reaching 100,000 base-pairs. Compared with short reads, the assemblies obtained from long-read sequencing platforms have much higher contig continuity and genome completeness as long fragments are able to extend paths into problematic or repetitive regions. Many successful assembly applications of the Pacific Biosciences technology have been reported ranging from small bacterial genomes to large plant and animal genomes. Recently, genome assemblies using Oxford Nanopore MinION data have attracted much attention due to the portability and low cost of this novel sequencing instrument. In this paper, we re-sequenced a well characterized genome, the Saccharomyces cerevisiae S288C strain, using three different platforms: MinION, PacBio and MiSeq. We present a comprehensive metric comparison of assemblies generated by various pipelines and discuss how the platform associated data characteristics affect the assembly quality. With a given read depth of 31X, the assemblies from both Pacific Biosciences and Oxford Nanopore MinION show excellent continuity and completeness for the 16 nuclear chromosomes, but not for the mitochondrial genome, whose reconstruction still represents a significant challenge.
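The abstract above does not say which metrics were compared, so the Python sketch below is only an assumption about what "continuity" and "read depth" typically mean in practice. It computes N50, a widely used contiguity statistic, and mean read depth as total sequenced bases divided by genome size. The function names and the toy numbers are invented for illustration.

```python
def n50(contig_lengths):
    """Return the N50: the largest length L such that contigs of
    length >= L together cover at least half of the assembly."""
    if not contig_lengths:
        raise ValueError("no contigs given")
    half = sum(contig_lengths) / 2
    running = 0
    # Walk contigs from longest to shortest until half the total is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half:
            return length


def mean_depth(total_read_bases, genome_size):
    """Mean read depth (e.g. 31 for '31X'): sequenced bases / genome size."""
    return total_read_bases / genome_size


# Toy values, not taken from the paper.
toy_contigs = [400, 300, 200, 100]
toy_n50 = n50(toy_contigs)         # 400 + 300 = 700 >= 500, so N50 is 300
toy_depth = mean_depth(3_100, 100)  # 3,100 bases over a 100 bp "genome"
```

A higher N50 at the same total assembly size means fewer, longer contigs, which is the sense in which long-read assemblies are described as more continuous.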

July 7, 2019

Higher-order organisation of extremely amplified, potentially functional and massively methylated 5S rDNA in European pikes (Esox sp.).

Pikes represent an important genus (Esox) harbouring a pre-duplication karyotype (2n = 2x = 50) of economically important salmonid pseudopolyploids. Here, we have characterized the 5S ribosomal RNA genes (rDNA) in Esox lucius and its closely related E. cisalpinus using cytogenetic, molecular and genomic approaches. Intragenomic homogeneity and copy number estimation was carried out using Illumina reads. The higher-order structure of rDNA arrays was investigated by the analysis of long PacBio reads. Position of loci on chromosomes was determined by FISH. DNA methylation was analysed by methylation-sensitive restriction enzymes. The 5S rDNA loci occupy exclusively (peri)centromeric regions on 30-38 acrocentric chromosomes in both E. lucius and E. cisalpinus. The large number of loci is accompanied by extreme amplification of genes (>20,000 copies), which is to the best of our knowledge one of the highest copy number of rRNA genes in animals ever reported. Conserved secondary structures of predicted 5S rRNAs indicate that most of the amplified genes are potentially functional. Only few SNPs were found in genic regions indicating their high homogeneity while intergenic spacers were more heterogeneous and several families were identified. Analysis of 10-30 kb-long molecules sequenced by the PacBio technology (containing about 40% of total 5S rDNA) revealed that the vast majority (96%) of genes are organised in large several kilobase-long blocks. Dispersed genes or short tandems were less common (4%). The adjacent 5S blocks were directly linked, separated by intervening DNA and even inverted. The 5S units differing in the intergenic spacers formed both homogeneous and heterogeneous (mixed) blocks indicating variable degree of homogenisation between the loci. Both E. lucius and E. cisalpinus 5S rDNA was heavily methylated at CG dinucleotides. Extreme amplification of 5S rRNA genes in the Esox genome occurred in the absence of significant pseudogenisation suggesting its recent origin and/or intensive homogenisation processes. The dense methylation of units indicates that powerful epigenetic mechanisms have evolved in this group of fish to silence amplified genes. We discuss how the higher-order repeat structures impact on homogenisation of 5S rDNA in the genome.

July 7, 2019

Coping with living in the soil: the genome of the parthenogenetic springtail Folsomia candida.

Folsomia candida is a model in soil biology, belonging to the family of Isotomidae, subclass Collembola. It reproduces parthenogenetically in the presence of Wolbachia, and exhibits remarkable physiological adaptations to stress. To better understand these features and adaptations to life in the soil, we studied its genome in the context of its parthenogenetic lifestyle. We applied Pacific Biosciences sequencing and assembly to generate a reference genome for F. candida of 221.7 Mbp, comprising only 162 scaffolds. The complete genome of its endosymbiont Wolbachia was also assembled and turned out to be the largest strain identified so far. Substantial gene family expansions and lineage-specific gene clusters were linked to stress response. A large number of genes (809) were acquired by horizontal gene transfer. A substantial fraction of these genes are involved in lignocellulose degradation. Also, the presence of genes involved in antibiotic biosynthesis was confirmed. Intra-genomic rearrangements of collinear gene clusters were observed, of which 11 were organized as palindromes. The Hox gene cluster of F. candida showed major rearrangements compared to the arthropod consensus cluster, resulting in a disorganized cluster. The expansion of stress response gene families suggests that stress defense was important to facilitate colonization of soils. The large number of HGT genes related to lignocellulose degradation could be beneficial to unlock carbohydrate sources in soil, especially those contained in decaying plant and fungal organic matter. Intra- as well as inter-scaffold duplications of gene clusters may be a consequence of its parthenogenetic lifestyle. This high quality genome will be instrumental for evolutionary biologists investigating deep phylogenetic lineages among arthropods and will provide the basis for a more mechanistic understanding in soil ecology and ecotoxicology.

July 7, 2019

Discovering and sequencing new plant viral genomes by next-generation sequencing: description of a practical pipeline

Small-scale sequencing has improved substantially in recent decades, culminating in the development of next-generation sequencing (NGS) technologies. Modern NGS methods have helped the discovery of many new plant viruses. Nevertheless, there is still a need to establish solid assembly pipelines targeting small genomes characterised by low identities to known viral sequences. Here, we describe and discuss the fundamental steps required for discovering and sequencing new plant viral genomes by NGS. A practical pipeline and standard alternative tools used in NGS analysis are presented.

July 7, 2019

Complete genome sequences of three representative Mycobacterium tuberculosis Beijing family strains belonging to distinct genotype clusters in Hanoi, Vietnam, during 2007 to 2009.

We present here three complete genome sequences of Mycobacterium tuberculosis Beijing family strains isolated in Hanoi, Vietnam. These three strains were selected from major genotypic clusters (15-MIRU-VNTR) identified in a previous population-based study. We emphasize their importance and potential as reference strains in this Asian region. Copyright © 2017 Wada et al.

July 7, 2019

Complete genome sequence of the olive-infecting strain Xylella fastidiosa subsp. pauca De Donno.

We report here the complete and annotated genome sequence of the plant-pathogenic bacterium Xylella fastidiosa subsp. pauca strain De Donno. This strain was recovered from an olive tree severely affected by olive quick decline syndrome (OQDS), a devastating olive disease associated with X. fastidiosa infections in susceptible olive cultivars. Copyright © 2017 Giampetruzzi et al.

July 7, 2019

Complete genome sequence of a Mycobacterium tuberculosis strain belonging to the East African-Indian family in the Indo-Oceanic lineage, isolated in Hanoi, Vietnam.

The East African-Indian (EAI) family of Mycobacterium tuberculosis is an endemic group mainly observed in Southeast Asia. Here, we report the complete genome sequence of an M. tuberculosis strain isolated as a member of the EAI family in Hanoi, Vietnam, a country with a high incidence of tuberculosis. Copyright © 2017 Wada et al.

July 7, 2019

Recombination-dependent replication and gene conversion homogenize repeat sequences and diversify plastid genome structure

There is a misinterpretation in the literature regarding the variable orientation of the small single copy region of plastid genomes (plastomes). The common phenomenon of small and large single copy inversion, hypothesized to occur through intramolecular recombination between inverted repeats (IR) in a circular, single unit-genome, in fact, more likely occurs through recombination-dependent replication (RDR) of linear plastome templates. If RDR can be primed through both intra- and intermolecular recombination, then this mechanism could not only create inversion isomers of so-called single copy regions, but also an array of alternative sequence arrangements. We used Illumina paired-end and PacBio single-molecule real-time (SMRT) sequences to characterize repeat structure in the plastome of Monsonia emarginata (Geraniaceae). We used OrgConv and inspected nucleotide alignments to infer ancestral nucleotides and identify gene conversion among repeats and mapped long (>1 kb) SMRT reads against the unit-genome assembly to identify alternative sequence arrangements. Although M. emarginata lacks the canonical IR, we found that large repeats (>1 kilobase; kb) represent ~22% of the plastome nucleotide content. Among the largest repeats (>2 kb), we identified GC-biased gene conversion and mapping filtered, long SMRT reads to the M. emarginata unit-genome assembly revealed alternative, substoichiometric sequence arrangements. We offer a model based on RDR and gene conversion between long repeated sequences in the M. emarginata plastome and provide support that both intra- and intermolecular recombination between large repeats, particularly in repeat-rich plastomes, varies unit-genome structure while homogenizing the nucleotide sequence of repeats. © 2017 Botanical Society of America.

July 7, 2019

Trichoderma reesei complete genome sequence, repeat-induced point mutation, and partitioning of CAZyme gene clusters.

Trichoderma reesei (Ascomycota, Pezizomycotina) QM6a is a model fungus for a broad spectrum of physiological phenomena, including plant cell wall degradation, industrial production of enzymes, light responses, conidiation, sexual development, polyketide biosynthesis, and plant-fungal interactions. The genomes of QM6a and its high enzyme-producing mutants have been sequenced by second-generation-sequencing methods and are publicly available from the Joint Genome Institute. While these genome sequences have offered useful information for genomic and transcriptomic studies, their limitations and especially their short read lengths make them poorly suited for some particular biological problems, including assembly, genome-wide determination of chromosome architecture, and genetic modification or engineering. We integrated Pacific Biosciences and Illumina sequencing platforms for the highest-quality genome assembly yet achieved, revealing seven telomere-to-telomere chromosomes (34,922,528 bp; 10877 genes) with 1630 newly predicted genes and >1.5 Mb of new sequences. Most new sequences are located on AT-rich blocks, including 7 centromeres, 14 subtelomeres, and 2329 interspersed AT-rich blocks. The seven QM6a centromeres separately consist of 24 conserved repeats and 37 putative centromere-encoded genes. These findings open up a new perspective for future centromere and chromosome architecture studies. Next, we demonstrate that sexual crossing readily induced cytosine-to-thymine point mutations on both tandem and unlinked duplicated sequences. We also show by bioinformatic analysis that T. reesei has evolved a robust repeat-induced point mutation (RIP) system to accumulate AT-rich sequences, with longer AT-rich blocks having more RIP mutations. The widespread distribution of AT-rich blocks correlates genome-wide partitions with gene clusters, explaining why clustering of genes has been reported to not influence gene expression in T. reesei. Compartmentation of ancestral gene clusters by AT-rich blocks might promote flexibilities that are evolutionarily advantageous in this fungus’ soil habitats and other natural environments. Our analyses, together with the complete genome sequence, provide a better blueprint for biotechnological and industrial applications.
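The abstract does not state how AT-rich blocks were delimited, so the following Python sketch is only one plausible way to flag AT-rich stretches: scan the sequence in fixed windows and report those whose A+T fraction meets a cutoff. The window size, step, the 0.7 threshold and the toy sequence are all hypothetical choices made for illustration, not values from the study.

```python
def at_fraction(seq):
    """Fraction of bases in seq that are A or T (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("A") + seq.count("T")) / len(seq)


def at_rich_windows(seq, window=10, step=10, threshold=0.7):
    """Return (start, end, fraction) for each window whose A+T fraction
    is at least `threshold`. All defaults are illustrative assumptions."""
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        frac = at_fraction(seq[start:start + window])
        if frac >= threshold:
            hits.append((start, start + window, frac))
    return hits


# Toy sequence: a GC-only window, an AT-only window, then a mixed window.
toy = "GCGCGCGCGC" + "ATATTAAATA" + "GCATGCATGC"
hits = at_rich_windows(toy)
```

On real chromosomes one would use much larger windows and merge adjacent hits into blocks; this sketch omits both steps for brevity.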

July 7, 2019

Draft nuclear genome sequence of the liquid hydrocarbon–accumulating green microalga Botryococcus braunii race B (Showa).

Botryococcus braunii has long been known as a prodigious producer of liquid hydrocarbon oils that can be converted into combustion engine fuels. This draft genome for the B race of B. braunii will allow researchers to unravel important hydrocarbon biosynthetic pathways and identify possible regulatory networks controlling this unusual metabolism. Copyright © 2017 Browne et al.
