April 21, 2020  |  

Resequencing the Genome of Malassezia restricta Strain KCTC 27527.

The draft genome sequence of Malassezia restricta KCTC 27527, a clinical isolate from a patient with dandruff, was previously reported. Using the PacBio Sequel platform, we completed and reannotated the genome of M. restricta KCTC 27527 for a better understanding of the genome of this fungus. Copyright © 2019 Cho et al.


April 21, 2020  |  

Genes of the pig, Sus scrofa, reconstructed with EvidentialGene.

The pig is a well-studied model animal of biomedical and agricultural importance. Genes of this species, Sus scrofa, are known from experiments and predictions, and collected at the NCBI reference sequence database section. Gene reconstruction from transcribed gene evidence of RNA-seq now can accurately and completely reproduce the biological gene sets of animals and plants. Such a gene set for the pig is reported here, including human orthologs missing from current NCBI and Ensembl reference pig gene sets, additional alternate transcripts, and other improvements. Methodology for accurate and complete gene set reconstruction from RNA is used: the automated SRA2Genes pipeline of EvidentialGene project.


April 21, 2020  |  

Immunogenetic factors driving formation of ultralong VH CDR3 in Bos taurus antibodies.

The antibody repertoire of Bos taurus is characterized by a subset of variable heavy (VH) chain regions with ultralong third complementarity determining regions (CDR3) which, compared to other species, can provide a potent response to challenging antigens like HIV env. These unusual CDR3 can range to over seventy highly diverse amino acids in length and form unique β-ribbon ‘stalk’ and disulfide bonded ‘knob’ structures, far from the typical antigen binding site. The genetic components and processes for forming these unusual cattle antibody VH CDR3 are not well understood. Here we analyze sequences of Bos taurus antibody VH domains and find that the subset with ultralong CDR3 exclusively uses a single variable gene, IGHV1-7 (VHBUL) rearranged to the longest diversity gene, IGHD8-2. An eight nucleotide duplication at the 3′ end of IGHV1-7 encodes a longer V-region producing an extended F β-strand that contributes to the stalk in a rearranged CDR3. A low amino acid variability was observed in CDR1 and CDR2, suggesting that antigen binding for this subset most likely only depends on the CDR3. Importantly, a novel, potentially AID mediated, deletional diversification mechanism of the B. taurus VH ultralong CDR3 knob was discovered, in which interior codons of the IGHD8-2 region are removed while maintaining integral structural components of the knob and descending strand of the stalk in place. These deletions serve to further diversify cysteine positions, and thus disulfide bonded loops. Hence, both germline and somatic genetic factors and processes appear to be involved in diversification of this structurally unusual cattle VH ultralong CDR3 repertoire.


April 21, 2020  |  

Potential of TLR-gene diversity in Czech indigenous cattle for resistance breeding as revealed by hybrid sequencing

A production herd of Czech Simmental cattle (Czech Red Pied, CRP), the conserved subpopulation of this breed, and the ancient local breed Czech Red cattle (CR) were screened for diversity in the antibacterial toll-like receptors (TLRs), which are members of the innate immune system. Polymerase chain reaction (PCR) amplicons of TLR1, TLR2, TLR4, TLR5, and TLR6 from pooled DNA samples were sequenced with PacBio technology, with 3–5× coverage per gene per animal. To increase the reliability of variant detection, the gDNA pools were sequenced in parallel with the Illumina X-ten platform at low coverage (60× per gene). The diversity in conserved CRP and CR was similar to the diversity in conserved and modern CRP, representing 76.4% and 70.9% of its variants, respectively. Sixty-eight (54.4%) polymorphisms in the five TLR genes were shared by the two breeds, whereas 38 (30.4%) were specific to the production herd of CRP; 4 (3.2%) were specific to the broad CRP population; 7 (5.6%) were present in both conserved populations; 5 (4.0%) were present solely for the conserved CRP; and 3 (2.4%) were restricted to CR. Consequently, gene pool erosion related to intensive breeding did not occur in Czech Simmental cattle. Similarly, no considerable consequences were found from known bottlenecks in the history of Czech Red cattle. On the other hand, the distinctness of the conserved populations and their potential for resistance breeding were only moderate. This relationship might be transferable to other non-abundant historical cattle breeds that are conserved as genetic resources. The estimates of polymorphism impact using Variant Effect Predictor and SIFT software tools allowed for the identification of candidate single-nucleotide polymorphisms (SNPs) for association studies related to infection resistance and targeted breeding.
Knowledge of TLR-gene diversity present in Czech Simmental populations may aid in the potential transfer of variant characteristics from other breeds.
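
The percentages quoted in this abstract can be cross-checked from the counts it lists. The sketch below is only an arithmetic check, not part of the authors' analysis. It assumes the six categories partition the full set of detected polymorphisms, so the total (125) is inferred by summing the counts rather than taken from the abstract; the dictionary keys are paraphrased labels.

```python
# Polymorphism counts per category, as listed in the abstract.
# Labels are paraphrased; the total is an inference (sum of the counts),
# not a figure stated in the source.
counts = {
    "shared by both breeds": 68,
    "production herd of CRP only": 38,
    "broad CRP population only": 4,
    "both conserved populations": 7,
    "conserved CRP only": 5,
    "CR only": 3,
}

total = sum(counts.values())  # assumption: the categories do not overlap

# Share of each category, rounded to one decimal place as in the abstract.
percentages = {label: round(100 * n / total, 1) for label, n in counts.items()}

for label, pct in percentages.items():
    print(f"{label}: {counts[label]} / {total} = {pct}%")
```

Under that assumption the six shares reproduce the figures given in the text (54.4%, 30.4%, 3.2%, 5.6%, 4.0%, and 2.4%).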


April 21, 2020  |  

Investigating the bacterial microbiota of traditional fermented dairy products using propidium monoazide with single-molecule real-time sequencing.

Traditional fermented dairy foods have been the major components of the Mongolian diet for millennia. In this study, we used propidium monoazide (PMA; binds to DNA of nonviable cells so that only viable cells are enumerated) and single-molecule real-time sequencing (SMRT) technology to investigate the total and viable bacterial compositions of 19 traditional fermented dairy foods, including koumiss from Inner Mongolia (KIM), koumiss from Mongolia (KM), and fermented cow milk from Mongolia (CM); sample groups treated with PMA were designated PKIM, PKM, and PCM. Full-length 16S rRNA sequencing identified 195 bacterial species in 121 genera and 13 phyla in PMA-treated and untreated samples. The PMA-treated and untreated samples differed significantly in their bacterial community composition and α-diversity values. The predominant species in KM, KIM, and CM were Lactobacillus helveticus, Streptococcus parauberis, and Lactobacillus delbrueckii, whereas the predominant species in PKM, PKIM, and PCM were Enterobacter xiangfangensis, Lactobacillus helveticus, and E. xiangfangensis, respectively. Weighted and unweighted principal coordinate analyses showed a clear clustering pattern with good separation and only minor overlapping. In addition, a pure culture method was performed to obtain lactic acid bacteria resources in dairy samples according to the results of SMRT sequencing. A total of 102 LAB strains were identified and Lb. helveticus (68.63%) was the most abundant, in agreement with SMRT sequencing results. Our results revealed that the bacterial communities of traditional dairy foods are complex and vary by type of fermented dairy product. The PMA treatment induced significant changes in bacterial community structure. Copyright © 2019 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.


April 21, 2020  |  

Assessment of the microbial diversity of Chinese Tianshan tibicos by single molecule, real-time sequencing technology.

Chinese Tianshan tibico grains were collected from the rural area of Tianshan in Xinjiang province, China. Typical tibico grains are known to consist of a polysaccharide matrix that embeds a variety of bacteria and yeasts. These grains are widely used in some rural regions to produce a beneficial sugary beverage that is slightly acidic and contains a low level of alcohol. This work aimed to characterize the microbiota composition of Chinese Tianshan tibicos using single molecule, real-time sequencing technology, which is advantageous in generating long reads. Our results revealed that the microbiota mainly comprised the bacterial species Lactobacillus hilgardii, Lactococcus raffinolactis, Leuconostoc mesenteroides, and Zymomonas mobilis, together with a Guehomyces pullulans-dominated fungal community. The data generated in this work help identify beneficial microbes in Chinese Tianshan tibico grains.


April 21, 2020  |  

Arcobacter cryaerophilus Isolated From New Zealand Mussels Harbor a Putative Virulence Plasmid.

A wide range of Arcobacter species have been described from shellfish in various countries but their presence has not been investigated in Australasia, in which shellfish are a popular delicacy. Since several arcobacters are considered to be emerging pathogens, we undertook a small study to evaluate their presence in several different shellfish, including greenshell mussels, oysters, and abalone (paua) in New Zealand. Arcobacter cryaerophilus, a species associated with human gastroenteritis, was the only species isolated, from greenshell mussels. Whole-genome sequencing revealed a range of genomic traits in these strains that were known or associated virulence factors. Furthermore, we describe the first putative virulence plasmid in Arcobacter, containing lytic, immunoavoidance, adhesion, antibiotic resistance, and gene transfer traits, among others. Complete genome sequence determination using a combination of long- and short-read genome sequencing strategies was needed to identify the plasmid, clearly demonstrating the benefits of this approach. The potential for plasmids to disseminate virulence traits among Arcobacter and other species warrants further consideration by researchers interested in the risks to public health from these organisms.


April 21, 2020  |  

Whole genome sequence and de novo assembly revealed genomic architecture of Indian Mithun (Bos frontalis).

Mithun (Bos frontalis), also called gayal, is an endangered bovine species, under the tribe bovini with 2n = 58 XX chromosome complements and reared under the tropical rain forests region of India, China, Myanmar, Bhutan and Bangladesh. However, the origin of this species is still disputed and information on its genomic architecture is scanty so far. We trust that availability of its whole genome sequence data and assembly will greatly solve this problem and help to generate much information, including the phylogenetic status of mithun. Recently, the first genome assembly of gayal, mithun of Chinese origin, was published. However, an improved reference genome assembly would still benefit in understanding genetic variation in mithun populations reared under diverse geographical locations and for building a superior consensus assembly. We, therefore, performed deep sequencing of the genome of an adult female mithun from India, assembled and annotated its genome and performed extensive bioinformatic analyses to produce a superior de novo genome assembly of mithun. We generated ~300 Gigabyte (Gb) raw reads from whole-genome deep sequencing platforms and assembled the sequence data using a hybrid assembly strategy to create a high quality de novo assembly of mithun with 96% recovered as per BUSCO analysis. The final genome assembly has a total length of 3.0 Gb, contains 5,015 scaffolds with an N50 value of 1 Mb. Repeat sequences constitute around 43.66% of the assembly. The genomic alignments between mithun to cattle showed that their genomes, as expected, are highly conserved. Gene annotation identified 28,044 protein-coding genes present in the mithun genome.
The gene orthologous groups of mithun showed a high degree of similarity in comparison with other species, while fewer mithun specific coding sequences were found compared to those in cattle. Here we presented the first de novo draft genome assembly of Indian mithun, which has better coverage, is less fragmented and better annotated, and constitutes a reasonably complete assembly compared to the previously published gayal genome. This comprehensive assembly unravelled the genomic architecture of mithun to a great extent and will provide a reference genome assembly to the research community to elucidate the evolutionary history of mithun across its distinct geographical locations.


April 21, 2020  |  

Improved annotation of the domestic pig genome through integration of Iso-Seq and RNA-seq data.

Our understanding of the pig transcriptome is limited. RNA transcript diversity among nine tissues was assessed using poly(A) selected single-molecule long-read isoform sequencing (Iso-seq) and Illumina RNA sequencing (RNA-seq) from a single White cross-bred pig. Across tissues, a total of 67,746 unique transcripts were observed, including 60.5% predicted protein-coding, 36.2% long non-coding RNA and 3.3% nonsense-mediated decay transcripts. On average, 90% of the splice junctions were supported by RNA-seq within tissue. A large proportion (80%) represented novel transcripts, mostly produced by known protein-coding genes (70%), while 17% corresponded to novel genes. On average, four transcripts per known gene (tpg) were identified; an increase over current EBI (1.9 tpg) and NCBI (2.9 tpg) annotations and closer to the number reported in human genome (4.2 tpg). Our new pig genome annotation extended more than 6000 known gene borders (5′ end extension, 3′ end extension, or both) compared to EBI or NCBI annotations. We validated a large proportion of these extensions by independent pig poly(A) selected 3′-RNA-seq data, or human FANTOM5 Cap Analysis of Gene Expression data. Further, we detected 10,465 novel genes (81% non-coding) not reported in current pig genome annotations. More than 80% of these novel genes had transcripts detected in > 1 tissue. In addition, more than 80% of novel intergenic genes with at least one transcript detected in liver tissue had H3K4me3 or H3K36me3 peaks mapping to their promoter and gene body, respectively, in independent liver chromatin immunoprecipitation data. These validated results show significant improvement over current pig genome annotations.


April 21, 2020  |  

Chromosome-level assembly of the water buffalo genome surpasses human and goat genomes in sequence contiguity.

Rapid innovation in sequencing technologies and improvement in assembly algorithms have enabled the creation of highly contiguous mammalian genomes. Here we report a chromosome-level assembly of the water buffalo (Bubalus bubalis) genome using single-molecule sequencing and chromatin conformation capture data. PacBio Sequel reads, with a mean length of 11.5 kb, helped to resolve repetitive elements and generate sequence contiguity. All five B. bubalis sub-metacentric chromosomes were correctly scaffolded with centromeres spanned. Although the index animal was partly inbred, 58% of the genome was haplotype-phased by FALCON-Unzip. This new reference genome improves the contig N50 of the previous short-read based buffalo assembly more than a thousand-fold and contains only 383 gaps. It surpasses the human and goat references in sequence contiguity and facilitates the annotation of hard to assemble gene clusters such as the major histocompatibility complex (MHC).
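
For readers unfamiliar with the contiguity metric quoted above, the following is a minimal sketch of how a contig N50 is commonly computed: sort the contig lengths from longest to shortest and report the length at which the running sum first reaches half of the total assembly size. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the buffalo assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("at least one length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Because the metric depends only on the length distribution, fewer and longer contigs raise it, which is why long reads that span repeats tend to improve N50 so sharply.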


April 21, 2020  |  

Assignment of virus and antimicrobial resistance genes to microbial hosts in a complex microbial community by combined long-read assembly and proximity ligation.

We describe a method that adds long-read sequencing to a mix of technologies used to assemble a highly complex cattle rumen microbial community, and provide a comparison to short read-based methods. Long-read alignments and Hi-C linkage between contigs support the identification of 188 novel virus-host associations and the determination of phage life cycle states in the rumen microbial community. The long-read assembly also identifies 94 antimicrobial resistance genes, compared to only seven alleles in the short-read assembly. We demonstrate novel techniques that work synergistically to improve characterization of biological features in a highly complex rumen microbial community.


April 21, 2020  |  

Long-read based de novo assembly of low-complexity metagenome samples results in finished genomes and reveals insights into strain diversity and an active phage system.

Complete and contiguous genome assemblies greatly improve the quality of subsequent systems-wide functional profiling studies and the ability to gain novel biological insights. While a de novo genome assembly of an isolated bacterial strain is in most cases straightforward, more informative data about co-existing bacteria as well as synergistic and antagonistic effects can be obtained from a direct analysis of microbial communities. However, the complexity of metagenomic samples represents a major challenge. While third generation sequencing technologies have been suggested to enable finished metagenome-assembled genomes, to our knowledge, the complete genome assembly of all dominant strains in a microbiome sample has not been demonstrated. Natural whey starter cultures (NWCs) are used in cheese production and represent low-complexity microbiomes. Previous studies of Swiss Gruyère and selected Italian hard cheeses, mostly based on amplicon metagenomics, concurred that three species generally pre-dominate: Streptococcus thermophilus, Lactobacillus helveticus and Lactobacillus delbrueckii. Two NWCs from Swiss Gruyère producers were subjected to whole metagenome shotgun sequencing using the Pacific Biosciences Sequel and Illumina MiSeq platforms. In addition, longer Oxford Nanopore Technologies MinION reads had to be generated for one to resolve repeat regions. Thereby, we achieved the complete assembly of all dominant bacterial genomes from these low-complexity NWCs, which was corroborated by a 16S rRNA amplicon survey. Moreover, two distinct L. helveticus strains were successfully co-assembled from the same sample. Besides bacterial chromosomes, we could also assemble several bacterial plasmids and phages and a corresponding prophage.
Biologically relevant insights were uncovered by linking the plasmids and phages to their respective host genomes using DNA methylation motifs on the plasmids and by matching prokaryotic CRISPR spacers with the corresponding protospacers on the phages. These results could only be achieved by employing long-read sequencing data able to span intragenomic as well as intergenomic repeats. Here, we demonstrate the feasibility of complete de novo genome assembly of all dominant strains from low-complexity NWCs based on whole metagenomics shotgun sequencing data. This allowed us to gain novel biological insights and is a fundamental basis for subsequent systems-wide omics analyses, functional profiling and phenotype to genotype analysis of specific microbial communities.
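
The spacer-to-protospacer matching mentioned above can be illustrated with a small sketch. It is not the authors' pipeline: it assumes exact string matching on either strand, whereas real analyses typically use alignment tools that tolerate mismatches. The function names, host and phage labels, and all sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def match_spacers(spacers_by_host, phages):
    """Link hosts to phages whose genome contains one of the host's spacers.

    Exact matching on both strands is a simplification for illustration.
    Returns a list of (host, phage, spacer) tuples.
    """
    links = []
    for host, spacers in spacers_by_host.items():
        for spacer in spacers:
            rc = reverse_complement(spacer)
            for phage, genome in phages.items():
                if spacer in genome or rc in genome:
                    links.append((host, phage, spacer))
    return links


# Hypothetical toy data: one spacer per host, one protospacer per phage.
spacers_by_host = {
    "host_A": ["ACGTTGCAAGGCTTAC"],
    "host_B": ["GATTACAGATTACAGG"],
}
phages = {
    # protospacer on the forward strand
    "phage_1": "CCCC" + "ACGTTGCAAGGCTTAC" + "TTGA",
    # protospacer on the reverse strand
    "phage_2": "AAAT" + reverse_complement("GATTACAGATTACAGG") + "GGC",
}

for host, phage, spacer in match_spacers(spacers_by_host, phages):
    print(f"{host} carries a spacer matching {phage}: {spacer}")
```

Checking both strands matters because a protospacer can lie on either strand of the assembled phage genome.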


April 21, 2020  |  

CAMISIM: simulating metagenomes and microbial communities.

Shotgun metagenome data sets of microbial communities are highly diverse, not only due to the natural variation of the underlying biological systems, but also due to differences in laboratory protocols, replicate numbers, and sequencing technologies. Accordingly, to effectively assess the performance of metagenomic analysis software, a wide range of benchmark data sets are required. We describe the CAMISIM microbial community and metagenome simulator. The software can model different microbial abundance profiles, multi-sample time series, and differential abundance studies, includes real and simulated strain-level diversity, and generates second- and third-generation sequencing data from taxonomic profiles or de novo. Gold standards are created for sequence assembly, genome binning, taxonomic binning, and taxonomic profiling. CAMISIM generated the benchmark data sets of the first CAMI challenge. For two simulated multi-sample data sets of the human and mouse gut microbiomes, we observed high functional congruence to the real data. As further applications, we investigated the effect of varying evolutionary genome divergence, sequencing depth, and read error profiles on two popular metagenome assemblers, MEGAHIT and metaSPAdes, on several thousand small data sets generated with CAMISIM. CAMISIM can simulate a wide variety of microbial communities and metagenome data sets together with standards of truth for method evaluation. All data sets and the software are freely available at https://github.com/CAMI-challenge/CAMISIM.

