Structural variation Archives

July 7, 2019

Long-read sequencing offers path to more accurate drug metabolism profiles

In the complex drug discovery process, one of the looming questions for any new compound is how it will be metabolised in the human body. While there are several methods for evaluating this, one of the most common involves CYP2D6, the enzyme encoded by the cytochrome P450-2D6 gene. This enzyme is involved in metabolising a quarter of all commonly used medications, making it an important target for ADME and pharmacogenomics studies. It is known to activate some drugs and to play a role in the deactivation or excretion of others.

July 7, 2019

The evolution of the natural killer complex; a comparison between mammals using new high-quality genome assemblies and targeted annotation.

Natural killer (NK) cells are a diverse population of lymphocytes with a range of biological roles including essential immune functions. NK cell diversity is in part created by the differential expression of cell surface receptors which modulate activation and function, including multiple subfamilies of C-type lectin receptors encoded within the NK complex (NKC). Little is known about the gene content of the NKC beyond rodent and primate lineages, other than it appears to be extremely variable between mammalian groups. We compared the NKC structure between mammalian species using new high-quality draft genome assemblies for cattle and goat; re-annotated sheep, pig, and horse genome assemblies; and the published human, rat, and mouse lemur NKC. The major NKC genes are largely in the equivalent positions in all eight species, with significant independent expansions and deletions between species, allowing us to propose a model for NKC evolution during mammalian radiation. The ruminant species, cattle and goats, have independently evolved a second KLRC locus flanked by KLRA and KLRJ, and a novel KLRH-like gene has acquired an activating tail. This novel gene has duplicated several times within cattle, while other activating receptor genes have been selectively disrupted. Targeted genome enrichment in cattle identified varying levels of allelic polymorphism between the NKC genes concentrated in the predicted extracellular ligand-binding domains. This novel recombination and allelic polymorphism is consistent with NKC evolution under balancing selection, suggesting that this diversity influences individual immune responses and may impact on differential outcomes of pathogen infection and vaccination.

July 7, 2019

Pearl millet genome sequence provides a resource to improve agronomic traits in arid environments.

Pearl millet [Cenchrus americanus (L.) Morrone] is a staple food for more than 90 million farmers in arid and semi-arid regions of sub-Saharan Africa, India and South Asia. We report the ~1.79 Gb draft whole genome sequence of reference genotype Tift 23D2B1-P1-P5, which contains an estimated 38,579 genes. We highlight the substantial enrichment for wax biosynthesis genes, which may contribute to heat and drought tolerance in this crop. We resequenced and analyzed 994 pearl millet lines, enabling insights into population structure, genetic diversity and domestication. We use these resequencing data to establish marker trait associations for genomic selection, to define heterotic pools, and to predict hybrid performance. We believe that these resources should empower researchers and breeders to improve this important staple crop.

July 7, 2019

The complete genome sequence of Streptomyces autolyticus CGMCC 0516, the producer of geldanamycin, autolytimycin, reblastatin and elaiophylin.

Streptomyces autolyticus CGMCC 0516 produces the anti-tumor benzoquinone ansamycins geldanamycin, autolytimycin, and reblastatin and the 16-membered macrodiolide elaiophylin. Here, we report the complete genome sequence of S. autolyticus CGMCC 0516, which consists of a 10,029,028bp linear chromosome and seven circular plasmids. Fifty-seven putative biosynthetic gene clusters for secondary metabolites were found. The geldanamycin, autolytimycin, and reblastatin biosynthetic gene clusters were located on the left arm (2.06-2.15Mb) of the chromosome, and the elaiophylin gene cluster was located on the right arm (9.45-9.53Mb). Twenty-one putative gene clusters with high or moderate similarity to important antibiotic biosynthetic gene clusters were found, including the antitumor agents echoside, bafilomycin, hygrocin, and toxoflavin; the antibacterial/antifungal agents nigericin, skyllamycin, kanamycin, naphthomycin, eco-02301, and bottromycin A2; the immunosuppressants meridamycin and brasilicardin A; the anti-inflammatory agent cyclooctatin; and the acute iron poisoning medication desferrioxamine B. The genome sequence reported here will enable us to study the biosynthetic mechanism of these important antibiotics and will facilitate the discovery of novel secondary metabolites with potential applications to human health. Copyright © 2017 Elsevier B.V. All rights reserved.

July 7, 2019

Bacteriophages are the major drivers of Shigella flexneri serotype 1c genome plasticity: a complete genome analysis.

Shigella flexneri is the primary cause of bacillary dysentery in developing countries. S. flexneri serotype 1c is a novel serotype that is endemic in many developing countries, but little is known about its genomic architecture and virulence signatures. We have sequenced, for the first time, the complete genome of S. flexneri serotype 1c strain Y394 to provide insights into its diversity and evolution. We generated a high-quality reference genome of S. flexneri serotype 1c using a hybrid of long-read single-molecule real-time (SMRT) sequencing technology and short-read MiSeq (Illumina) sequencing technology. The Y394 chromosome is 4.58 Mb in size and shares its basic genomic features with other S. flexneri complete genomes. However, it possesses a unique and highly modified O-antigen structure comprising three distinct O-antigen modifying gene clusters that potentially came from three different bacteriophages. It also possesses a large number of hypothetical unique genes compared to other S. flexneri genomes. Despite a high level of structural and functional similarity between the Y394 genome and other S. flexneri genomes, there are marked differences in the pathogenic islands. The diversity in the pathogenic islands suggests that these bacterial pathogens are well adapted to respond to the selection pressures during their evolution, which might contribute to the differences in their virulence potential.

July 7, 2019

XCAVATOR: accurate detection and genotyping of copy number variants from second and third generation whole-genome sequencing experiments.

We developed a novel software package, XCAVATOR, for the identification of genomic regions involved in copy number variants/alterations (CNVs/CNAs) from short- and long-read whole-genome sequencing experiments. Using simulated and real datasets, we showed that our tool, which is based on a read count approach, can predict the boundaries and the absolute number of DNA copies of CNVs/CNAs with high resolution. To demonstrate the power of our software, we applied it to the analysis of Illumina and Pacific Biosciences data and compared its performance to that of ten other state-of-the-art tools. All the analyses we performed demonstrate that XCAVATOR can detect germline and somatic CNVs/CNAs, outperforming all the other tools we compared. XCAVATOR is freely available at http://sourceforge.net/projects/xcavator/ .
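To make the idea of a read count approach concrete, here is a minimal sketch in Python. It is not XCAVATOR's algorithm: the function name, the use of a matched control sample, the median normalisation, and the diploid baseline are all assumptions chosen for illustration. It only shows the general principle that the ratio of reads falling in a window, relative to a baseline, scales with the number of DNA copies.

```python
from statistics import median


def estimate_copy_number(sample_counts, control_counts, ploidy=2):
    """Estimate an integer copy number for each fixed-size genomic window.

    Hypothetical helper for illustration only. Both inputs are lists of
    read counts over the same windows, one from the test sample and one
    from a control assumed to carry `ploidy` copies everywhere.
    """
    # Per-window ratio of sample to control reads; undefined where the
    # control has no coverage.
    raw = [s / c if c > 0 else None for s, c in zip(sample_counts, control_counts)]

    usable = [r for r in raw if r is not None]
    if not usable:
        raise ValueError("no window has control coverage")

    # The median ratio is taken as the normal (baseline) state, which
    # removes the difference in sequencing depth between the two libraries.
    baseline = median(usable)
    if baseline == 0:
        raise ValueError("baseline ratio is zero; cannot normalise")

    return [None if r is None else round(ploidy * r / baseline) for r in raw]


if __name__ == "__main__":
    sample = [100, 100, 200, 100, 50, 100]
    control = [100, 100, 100, 100, 100, 100]
    print(estimate_copy_number(sample, control))
```

In this toy input the third window has twice the baseline read count and the fifth has half of it, so they are reported as a gain and a loss respectively. Real tools additionally correct for GC content and mappability and segment adjacent windows before calling boundaries.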

July 7, 2019

SVachra: a tool to identify genomic structural variation in mate pair sequencing data containing inward and outward facing reads.

Characterization of genomic structural variation (SV) is essential to expanding the research and clinical applications of genome sequencing. Reliance upon short DNA fragment paired end sequencing has yielded a wealth of single nucleotide variants and internal sequencing read insertions-deletions, at the cost of limited SV detection. Multi-kilobase DNA fragment mate pair sequencing has helped fill the void in SV detection, but has introduced new analytic challenges requiring SV detection tools specifically designed for mate pair sequencing data. Here, we introduce SVachra (Structural Variation Assessment of CHRomosomal Aberrations), a breakpoint calling program that identifies large insertions-deletions, inversions, and inter- and intra-chromosomal translocations utilizing both inward and outward facing read types generated by mate pair sequencing. We demonstrate SVachra’s utility by executing the program on large-insert (Illumina Nextera) mate pair sequencing data from the personal genome of a single subject (HS1011). An additional data set of long reads (Pacific Biosciences RS II) was also generated to validate SV calls from SVachra and other comparison SV calling programs. SVachra exhibited the highest validation rate and reported the widest distribution of SV types and size ranges when compared to other SV callers. SVachra is a highly specific breakpoint calling program that exhibits a more unbiased SV detection methodology than other callers.
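The distinction between inward and outward facing read pairs can be sketched as follows. This is an illustrative Python snippet, not code from SVachra: the function name and the tuple-based input are assumptions, and a real implementation would read positions and strands from aligned BAM records. It only shows how the orientation of a pair follows from which read lies leftmost and which strand each read maps to.

```python
def classify_pair(pos1, strand1, pos2, strand2):
    """Classify a same-chromosome read pair by relative orientation.

    Hypothetical helper for illustration only. Positions are leftmost
    alignment coordinates; strands are "+" (forward) or "-" (reverse).
    """
    # Order the two reads by genomic coordinate so that "left" and
    # "right" are well defined regardless of which read is read 1.
    (_, left_strand), (_, right_strand) = sorted([(pos1, strand1), (pos2, strand2)])

    if left_strand == "+" and right_strand == "-":
        return "inward"       # reads point toward each other
    if left_strand == "-" and right_strand == "+":
        return "outward"      # reads point away from each other
    return "same-strand"      # both reads map to the same strand


if __name__ == "__main__":
    print(classify_pair(100, "+", 5000, "-"))
    print(classify_pair(5000, "+", 100, "-"))
    print(classify_pair(100, "+", 5000, "+"))
```

A caller that handles both read types would typically keep a separate expected insert-size distribution for each orientation class before flagging discordant pairs; that step is omitted here.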

July 7, 2019

The Mobile Element Locator Tool (MELT): population-scale mobile element discovery and biology.

Mobile element insertions (MEIs) represent ~25% of all structural variants in human genomes. Moreover, when they disrupt genes, MEIs can influence human traits and diseases. Therefore, MEIs should be fully discovered along with other forms of genetic variation in whole genome sequencing (WGS) projects involving population genetics, human diseases, and clinical genomics. Here, we describe the Mobile Element Locator Tool (MELT), which was developed as part of the 1000 Genomes Project to perform MEI discovery on a population scale. Using both Illumina WGS data and simulations, we demonstrate that MELT outperforms existing MEI discovery tools in terms of speed, scalability, specificity, and sensitivity, while also detecting a broader spectrum of MEI-associated features. Several run modes were developed to perform MEI discovery on local and cloud systems. In addition to using MELT to discover MEIs in modern humans as part of the 1000 Genomes Project, we also used it to discover MEIs in chimpanzees and ancient (Neanderthal and Denisovan) hominids. We detected diverse patterns of MEI stratification across these populations that likely were caused by (1) diverse rates of MEI production from source elements, (2) diverse patterns of MEI inheritance, and (3) the introgression of ancient MEIs into modern human genomes. Overall, our study provides the most comprehensive map of MEIs to date spanning chimpanzees, ancient hominids, and modern humans and reveals new aspects of MEI biology in these lineages. We also demonstrate that MELT is a robust platform for MEI discovery and analysis in a variety of experimental settings. © 2017 Gardner et al.; Published by Cold Spring Harbor Laboratory Press.

July 7, 2019

Building a locally diploid genome and transcriptome of the diatom Fragilariopsis cylindrus.

The genome of the cold-adapted diatom Fragilariopsis cylindrus is characterized by highly diverged haplotypes that intersperse its homozygous genome. Here, we describe how a combination of PacBio DNA and Illumina RNA sequencing can be used to resolve this complex genomic landscape locally into the highly diverged haplotypes, and how to map various environmentally controlled transcripts onto individual haplotypes. We assembled PacBio sequence data with the FALCON assembler and created a haplotype-resolved annotation of the assembly using annotations of a Sanger-sequenced F. cylindrus genome. RNA-seq datasets from six different growth conditions were used to resolve allele-specific gene expression in F. cylindrus. This approach enables the study of differential expression of alleles in a complex genomic landscape and provides a useful tool to study how diverged haplotypes in diploid organisms are used for adaptation and evolution in highly variable environments.

July 7, 2019

Structural variation offers new home for disease associations and gene discovery

Following completion of the Human Genome Project, most studies of human genetic variation have centered on single nucleotide polymorphisms (SNPs). SNPs are numerous in individual genomes and serve as useful genetic markers in association studies across a population. These markers have been leveraged to identify genetic loci for disease risk and draw associations with numerous traits of interest. Despite their usefulness, SNPs do not tell the whole story. For example, most SNPs are associated with only a small increased risk of disease, and they usually cannot identify on their own which genes are causal. This has resulted in what many researchers have referred to as missing or hidden heritability.

July 7, 2019

Lightning-fast genome variant detection with GROM.

Current human whole genome sequencing projects produce massive amounts of data, often creating significant computational challenges. Different approaches have been developed for each type of genome variant and method of its detection, requiring users to run multiple algorithms to find variants. We present GROM (Genome Rearrangement OmniMapper), a novel comprehensive variant detection algorithm accepting aligned read files as input and finding SNVs, indels, structural variants (SVs), and copy number variants (CNVs). We show that GROM outperforms state-of-the-art methods on seven validated benchmarks using two whole genome sequencing (WGS) datasets. Additionally, GROM boasts lightning-fast run times, analyzing a 50x WGS human dataset (NA12878) on commonly available computer hardware in 11 minutes, more than an order of magnitude (up to 72 times) faster than tools detecting a similar range of variants. Addressing the needs of big data analysis, GROM combines SNV, indel, SV, and CNV detection in one algorithm, providing superior speed, sensitivity, and precision. GROM is also able to detect CNVs, SNVs and indels in non-paired read WGS libraries, as well as SNVs and indels in whole exome or RNA sequencing datasets.

July 7, 2019

The complete genome sequence of Streptomyces albolongus YIM 101047, the producer of novel bafilomycins and odoriferous sesquiterpenoids.

Streptomyces albolongus YIM 101047 produces novel bafilomycins and odoriferous sesquiterpenoids with cytotoxic and antimicrobial activities. Here, we report the complete genome sequence of S. albolongus YIM 101047, which consists of an 8,027,788bp linear chromosome. Forty-six putative biosynthetic gene clusters of secondary metabolites were found. The sesquiterpenoid gene cluster was on the left arm (0.09-0.10Mb), and the bafilomycin biosynthetic gene cluster was on the right arm (7.46-7.64Mb) of the chromosome. Twenty-two putative gene clusters with high or moderate similarity to important antibiotic biosynthetic gene clusters were found, including the antitumor agents bafilomycin, epothilone and hedamycin; the antibacterial/antifungal agents clavulanic acid, collismycin A, frontalamides, kanamycin, streptomycin and streptothricin; the protein phosphatase inhibitor RK-682; and the acute iron poisoning medication desferrioxamine B. The genome sequence reported here will enable us to study the biosynthetic mechanism of these important antibiotics and will facilitate the discovery of novel secondary metabolites with potential applications to human health. Copyright © 2017 Elsevier B.V. All rights reserved.

July 7, 2019

Harnessing whole genome sequencing in medical mycology.

Comparative genome sequencing studies of human fungal pathogens enable identification of genes and variants associated with virulence and drug resistance. This review describes current approaches, resources, and advances in applying whole genome sequencing to study clinically important fungal pathogens. Genomes for some important fungal pathogens were only recently assembled, revealing gene family expansions in many species and extreme gene loss in one obligate species. The scale and scope of species sequenced is rapidly expanding, leveraging technological advances to assemble and annotate genomes with higher precision. By using iteratively improved reference assemblies or those generated de novo for new species, recent studies have compared the sequence of isolates representing populations or clinical cohorts. Whole genome approaches provide the resolution necessary for comparison of closely related isolates, for example, in the analysis of outbreaks or sampled across time within a single host. Genomic analysis of fungal pathogens has enabled both basic research and diagnostic studies. The increased scale of sequencing can be applied across populations, and new metagenomic methods allow direct analysis of complex samples.

July 7, 2019

GRIDSS: sensitive and specific genomic rearrangement detection using positional de Bruijn graph assembly.

The identification of genomic rearrangements with high sensitivity and specificity using massively parallel sequencing remains a major challenge, particularly in precision medicine and cancer research. Here, we describe a new method for detecting rearrangements, GRIDSS (Genome Rearrangement IDentification Software Suite). GRIDSS is a multithreaded structural variant (SV) caller that performs efficient genome-wide break-end assembly prior to variant calling using a novel positional de Bruijn graph-based assembler. By combining assembly, split read, and read pair evidence using a probabilistic scoring, GRIDSS achieves high sensitivity and specificity on simulated, cell line, and patient tumor data, recently winning SV subchallenge #5 of the ICGC-TCGA DREAM8.5 Somatic Mutation Calling Challenge. On human cell line data, GRIDSS halves the false discovery rate compared to other recent methods while matching or exceeding their sensitivity. GRIDSS identifies nontemplate sequence insertions, microhomologies, and large imperfect homologies, estimates a quality score for each breakpoint, stratifies calls into high or low confidence, and supports multisample analysis. © 2017 Cameron et al.; Published by Cold Spring Harbor Laboratory Press.
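As a rough illustration of what makes a de Bruijn graph "positional", the sketch below builds a toy graph in Python whose nodes are (k-mer, position) pairs rather than bare k-mers. It is not GRIDSS code: the function name, the input format, and the assumption that every read arrives with a single anchored start position are simplifications made for this example. The point is only that the same k-mer observed at two distant loci yields two separate nodes, so unrelated regions are not merged during assembly.

```python
from collections import defaultdict


def build_positional_graph(reads, k):
    """Build a toy positional de Bruijn graph.

    Hypothetical helper for illustration only. `reads` is an iterable of
    (sequence, anchor_position) pairs, where anchor_position is the
    assumed genomic coordinate of the first base. Returns a dict mapping
    each edge ((kmer, pos), (next_kmer, pos + 1)) to its read support.
    """
    edges = defaultdict(int)
    for seq, start in reads:
        # Each node pairs a k-mer with the position where it is expected.
        nodes = [(seq[i:i + k], start + i) for i in range(len(seq) - k + 1)]
        # Consecutive k-mers of a read overlap by k-1 bases and form an edge.
        for a, b in zip(nodes, nodes[1:]):
            edges[(a, b)] += 1
    return dict(edges)


if __name__ == "__main__":
    reads = [("ACGTA", 10), ("ACGTA", 10), ("ACGTA", 500)]
    graph = build_positional_graph(reads, k=3)
    for (src, dst), support in sorted(graph.items(), key=lambda e: e[0][0][1]):
        print(src, "->", dst, "support:", support)
```

In an ordinary de Bruijn graph the three reads above would collapse into one path with support 3; here the read anchored at position 500 forms its own path, separate from the two reads anchored at position 10.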

July 7, 2019

Large-scale suppression of recombination predates genomic rearrangements in Neurospora tetrasperma.

A common feature of eukaryote genomes is large chromosomal regions where recombination is absent or strongly reduced, but the factors that cause this reduction are not well understood. Genomic rearrangements have often been implicated, but they may also be a consequence of recombination suppression rather than a cause. In this study, we generate eight high-quality genomic data sets of the filamentous ascomycete Neurospora tetrasperma, a fungus that lacks recombination over most of its largest chromosome. The genomes surprisingly reveal collinearity of the non-recombining regions, and although large inversions are enriched in these regions, we conclude these inversions to be derived and not the cause of the suppression. To our knowledge, this is the first time that non-recombining, genic regions as large as 86% of a full chromosome (or 8 Mbp) are shown to be collinear. These findings are of significant interest for our understanding of the evolution of sex chromosomes and other supergene complexes.
