Large genome Archives

April 21, 2020

Long-read sequencing identified intronic repeat expansions in SAMD12 from Chinese pedigrees affected with familial cortical myoclonic tremor with epilepsy.

The locus for familial cortical myoclonic tremor with epilepsy (FCMTE) has long been mapped to 8q24 in linkage studies, but the causative mutations remain unclear. Recently, expansions of intronic TTTCA and TTTTA repeat motifs within SAMD12 were found to be involved in the pathogenesis of FCMTE in Japanese pedigrees. We aim to identify the causative mutations of FCMTE in Chinese pedigrees. We performed genetic linkage analysis by microsatellite markers in a five-generation Chinese pedigree with 55 members. We also used array-comparative genomic hybridisation (CGH) and next-generation sequencing (NGS) technologies (whole-exome sequencing, capture region deep sequencing and whole-genome sequencing) to identify the causative mutations in the disease locus. Recently, we used low-coverage (~10×) long-read genome sequencing (LRS) on the PacBio Sequel and Oxford Nanopore platforms to identify the causative mutations, and used repeat-primed PCR for validation of the repeat expansions. Linkage analysis mapped the disease locus to 8q23.3-24.23. Array-CGH and NGS failed to identify causative mutations in this locus. LRS identified the intronic TTTCA and TTTTA repeat expansions in SAMD12 as the causative mutations, thus corroborating the recently published results in Japanese pedigrees. We identified the pentanucleotide repeat expansion in SAMD12 as the causative mutation in Chinese FCMTE pedigrees. Our study also suggested that LRS is an effective tool for molecular diagnosis of genetic disorders, especially for neurological diseases that cannot be positively diagnosed by conventional clinical microarray and NGS technologies. © Author(s) (or their employer(s)) 2019. No commercial re-use. See rights and permissions. Published by BMJ.
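
As a rough illustration of what spotting a pentanucleotide expansion in a single long read involves, the sketch below finds the longest uninterrupted tandem run of a motif such as TTTCA or TTTTA. The function name, the toy read and the exact-match approach are assumptions made for illustration only; this is not the pipeline used in the study, and real long reads would need error-tolerant alignment rather than exact matching.

```python
import re

def longest_tandem_run(read: str, motif: str) -> int:
    """Return the largest number of back-to-back exact copies of motif in read."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    # Each match is one uninterrupted run; convert its length to a copy count.
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(read.upper())]
    return max(runs, default=0)

# Hypothetical toy read: flanking sequence around a TTTTA run and a TTTCA run.
toy_read = "GATTACA" + "TTTTA" * 12 + "TTTCA" * 7 + "CCGGA"

for motif in ("TTTTA", "TTTCA"):
    print(motif, longest_tandem_run(toy_read, motif))
```

A read whose copy count far exceeds that of the reference allele would then be a candidate for follow-up, for example by the repeat-primed PCR mentioned in the abstract.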

April 21, 2020

Long-read sequence capture of the haemoglobin gene clusters across codfish species.

Combining high-throughput sequencing with targeted sequence capture has become an attractive tool to study specific genomic regions of interest. Most studies have so far focused on the exome using short-read technology. These approaches are not designed to capture intergenic regions needed to reconstruct genomic organization, including regulatory regions and gene synteny. Here, we demonstrate the power of combining targeted sequence capture with long-read sequencing technology for comparative genomic analyses of the haemoglobin (Hb) gene clusters across eight species separated by up to 70 million years. Guided by the reference genome assembly of the Atlantic cod (Gadus morhua) together with genome information from draft assemblies of selected codfishes, we designed probes covering the two Hb gene clusters. Use of custom-made barcodes combined with PacBio RSII sequencing led to highly continuous assemblies of the LA (~100 kb) and MN (~200 kb) clusters, which include syntenic regions of coding and intergenic sequences. Our results revealed an overall conserved genomic organization of the Hb genes within this lineage, yet with several, lineage-specific gene duplications. Moreover, for some of the species examined, we identified amino acid substitutions at two sites in the Hbb1 gene as well as length polymorphisms in its regulatory region, which has previously been linked to temperature adaptation in Atlantic cod populations. This study highlights the use of targeted long-read capture as a versatile approach for comparative genomic studies by generation of a cross-species genomic resource elucidating the evolutionary history of the Hb gene family across the highly divergent group of codfishes. © 2018 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.

April 21, 2020

The complete chloroplast genome sequence of watercress (Nasturtium officinale R. Br.): Genome organization, adaptive evolution and phylogenetic relationships in Cardamineae.

Watercress (Nasturtium officinale R. Br.), an aquatic leafy vegetable of the Brassicaceae family, is known as a nutritional powerhouse. Here, we de novo sequenced and assembled the complete chloroplast (cp) genome of watercress based on combined PacBio and Illumina data. The cp genome is 155,106 bp in length, exhibiting a typical quadripartite structure including a pair of inverted repeats (IRA and IRB) of 26,505 bp separated by a large single copy (LSC) region of 84,265 bp and a small single copy (SSC) region of 17,831 bp. The genome contained 113 unique genes, including 79 protein-coding genes, 30 tRNAs and 4 rRNAs, with 20 duplicated in the IRs. Compared with the prior cp genome of watercress deposited in GenBank, 21 single nucleotide polymorphisms (SNPs) and 27 indels were identified, mainly located in noncoding sequences. A total of 49 repeat structures and 71 simple sequence repeats (SSRs) were detected. Codon usage showed a bias for A/T-ending codons in the cp genome of watercress. Moreover, 45 RNA editing sites were predicted in 16 genes, all for C-to-U transitions. A comparative plastome study with Cardamineae species revealed a conserved gene order and high similarity of protein-coding sequences. Analysis of the Ka/Ks ratios of Cardamineae suggested positive selection exerted on the ycf2 gene in watercress, which might reflect specific adaptations of watercress to its particular living environment. Phylogenetic analyses based on complete cp genomes and common protein-coding genes from 56 species showed that the genus Nasturtium was a sister to Cardamine in the Cardamineae tribe. Our study provides valuable resources for future evolution, population genetics and molecular biology studies of watercress. Copyright © 2019 Elsevier B.V. All rights reserved.
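
The quadripartite structure can be checked with simple arithmetic: the total plastome length should equal the LSC plus the SSC plus two copies of the inverted repeat. The short sketch below does this with the region lengths reported in the abstract; the function and variable names are illustrative assumptions, not anything taken from the paper.

```python
# Region lengths (bp) as reported in the abstract above.
LSC = 84_265   # large single copy region
SSC = 17_831   # small single copy region
IR = 26_505    # length of each inverted repeat copy (IRA and IRB)

def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length: LSC + SSC + two inverted-repeat copies."""
    return lsc + ssc + 2 * ir

total = quadripartite_length(LSC, SSC, IR)
print(total)  # 155106, matching the reported genome length
```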

April 21, 2020

High Quality Draft Genome of Arogyapacha (Trichopus zeylanicus), an Important Medicinal Plant Endemic to Western Ghats of India.

Arogyapacha, the local name of Trichopus zeylanicus, is a rare, indigenous medicinal plant of India. This plant is famous for its traditional use as an instant energy stimulant. So far, no genomic resource is available for this important plant and hence its metabolic pathways are poorly understood. Here, we report on a high-quality draft assembly of the approximately 713.4 Mb genome of T. zeylanicus, the first draft genome from the genus Trichopus. The assembly was generated in a hybrid approach using Illumina short reads and PacBio long reads. The total assembly comprised 22601 scaffolds with an N50 value of 433.3 Kb. We predicted 34452 protein-coding genes in the T. zeylanicus genome and found that a significant portion of these predicted genes were associated with various secondary metabolite biosynthetic pathways. Comparative genome analysis revealed extensive gene collinearity between T. zeylanicus and its closely related plant species. The present genome and annotation data provide an essential resource to speed up research on secondary metabolism, breeding and molecular evolution of T. zeylanicus. Copyright © 2019 Chellappan et al.
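
For readers unfamiliar with the N50 statistic quoted here, the sketch below shows one common way to compute it: sort the scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the assembly. The function name and the toy lengths are assumptions for illustration; they are not data from the T. zeylanicus assembly.

```python
def n50(lengths):
    """Length L such that sequences of length >= L contain at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no sequence lengths given")

# Hypothetical scaffold lengths (arbitrary units), not real assembly data.
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))  # 70: 80 + 70 = 150, which first reaches half of 290
```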

April 21, 2020

De Novo Genome Sequence Assembly of Dwarf Coconut (Cocos nucifera L. ‘Catigan Green Dwarf’) Provides Insights into Genomic Variation Between Coconut Types and Related Palm Species.

We report the first whole genome sequence (WGS) assembly and annotation of a dwarf coconut variety, ‘Catigan Green Dwarf’ (CATD). The genome sequence was generated using the PacBio SMRT sequencing platform at 15X coverage of the expected genome size of 2.15 Gbp, which was corrected with assembled 50X Illumina paired-end MiSeq reads of the same genome. The draft genome was improved through Chicago sequencing to generate a scaffold assembly that results in a total genome size of 2.1 Gbp consisting of 7,998 scaffolds with N50 of 570,487 bp. The final assembly covers around 97.6% of the estimated genome size of coconut ‘CATD’ based on homozygous k-mer peak analysis. A total of 34,958 high-confidence gene models were predicted and functionally associated to various economically important traits, such as pest/disease resistance, drought tolerance, coconut oil biosynthesis, and putative transcription factors. The assembled genome was used to infer the evolutionary relationship within the palm family based on genomic variations and synteny of coding gene sequences. Data show that at least three (3) rounds of whole genome duplication occurred and are commonly shared by these members of the Arecaceae family. A total of 7,139 unique SSR markers were designed to be used as a resource in marker-based breeding. In addition, we discovered 58,503 variants in coconut by aligning the Hainan Tall (HAT) WGS reads to the non-repetitive regions of the assembled CATD genome. The gene markers and genome-wide SSR markers established here will facilitate the development of varieties with resilience to climate change, resistance to pests and diseases, and improved oil yield and quality. Copyright © 2019 Lantican et al.
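
The coverage figures translate into approximate sequencing yields by multiplying coverage by the expected genome size. The sketch below works this out for the numbers quoted in the abstract, assuming that both the 15X and the 50X figures are relative to the same 2.15 Gbp estimate; the function name is illustrative, and the results are back-of-the-envelope values, not yields reported by the authors.

```python
def bases_for_coverage(genome_size_bp: float, coverage: float) -> float:
    """Approximate total bases sequenced for a given fold coverage."""
    return genome_size_bp * coverage

GENOME_SIZE_BP = 2.15e9  # expected genome size from the abstract

pacbio_bases = bases_for_coverage(GENOME_SIZE_BP, 15)    # 15X PacBio
illumina_bases = bases_for_coverage(GENOME_SIZE_BP, 50)  # 50X Illumina (assumed same basis)

print(f"PacBio: {pacbio_bases / 1e9:.2f} Gbp, Illumina: {illumina_bases / 1e9:.1f} Gbp")
```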

April 21, 2020

Hybrid Genome Assembly of a Neotropical Mutualistic Ant.

The success of social insects is largely intertwined with their highly advanced chemical communication system that facilitates recognition and discrimination of species and nest-mates, recruitment, and division of labor. Hydrocarbons, which cover the cuticle of insects, not only serve as waterproofing agents but also constitute a major component of this communication system. Two cryptic Crematogaster species, which share their nest with Camponotus ants, show striking diversity in their cuticular hydrocarbon (CHC) profile. This mutualistic system therefore offers a great opportunity to study the genetic basis of CHC divergence between sister species. As a basis for further genome-wide studies high-quality genomes are needed. Here, we present the annotated draft genome for Crematogaster levior A. By combining the three most commonly used sequencing techniques-Illumina, PacBio, and Oxford Nanopore-we constructed a high-quality de novo ant genome. We show that even low coverage of long reads can add significantly to overall genome contiguity. Annotation of desaturase and elongase genes, which play a role in CHC biosynthesis revealed one of the largest repertoires in ants and a higher number of desaturases in general than in other Hymenoptera. This may provide a mechanistic explanation for the high diversity observed in C. levior CHC profiles. © The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

April 21, 2020

Genome sequencing and CRISPR/Cas9 gene editing of an early flowering Mini-Citrus (Fortunella hindsii).

Hongkong kumquat (Fortunella hindsii) is a wild citrus species characterized by dwarf plant height and early flowering. Here, we identified the monoembryonic F. hindsii (designated as ‘Mini-Citrus’) for the first time and constructed its selfing lines. This germplasm constitutes an ideal model for the genetic and functional genomics studies of citrus, which have been severely hindered by the long juvenility and inherent apomixis of citrus. F. hindsii showed a very short juvenile period (~8 months) and stable monoembryonic phenotype under cultivation. We report the first de novo assembled 373.6 Mb genome sequences (Contig-N50 2.2 Mb and Scaffold-N50 5.2 Mb) for F. hindsii. In total, 32 257 protein-coding genes were annotated, 96.9% of which had homologues in eight other Citrinae species. The phylogenomic analysis revealed a close relationship of F. hindsii with cultivated citrus varieties, especially with mandarin. Furthermore, the CRISPR/Cas9 system was demonstrated to be an efficient strategy to generate targeted mutagenesis in F. hindsii. The modifications of target genes in the CRISPR-modified F. hindsii were predominantly 1-bp insertions or small deletions. This genetic transformation system based on F. hindsii could shorten the whole process from explant to T1 mutant to about 15 months. Overall, due to its short juvenility, monoembryony, close genetic background to cultivated citrus and applicability of CRISPR, F. hindsii shows unprecedented potential to be used as a model species for citrus research. © 2019 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

April 21, 2020

Medaka Population Genome Structure and Demographic History Described via Genotyping-by-Sequencing.

Medaka is a model organism in medicine, genetics, developmental biology and population genetics. Lab stocks composed of more than 100 local wild populations are available for research in these fields. Thus, medaka represents a potentially excellent bioresource for screening disease-risk- and adaptation-related genes in genome-wide association studies. Although the genetic population structure should be known before performing such an analysis, a comprehensive study on the genome-wide diversity of wild medaka populations has not been performed. Here, we performed genotyping-by-sequencing (GBS) for 81 and 12 medakas captured from a bioresource and the wild, respectively. Based on the GBS data, we evaluated the genetic population structure and estimated the demographic parameters using an approximate Bayesian computation (ABC) framework. The genome-wide data confirmed that there were substantial differences between local populations and supported our previously proposed hypothesis on medaka dispersal based on mitochondrial genome (mtDNA) data. A new finding was that a local group that was thought to be a hybrid between the northern and the southern Japanese groups was actually an origin of the northern Japanese group. Thus, this paper presents the first population-genomic study of medaka and reveals its population structure and history based on chromosomal genetic diversity. Copyright © 2019 by the Genetics Society of America.

April 21, 2020

Multiple modes of convergent adaptation in the spread of glyphosate-resistant Amaranthus tuberculatus.

The selection pressure exerted by herbicides has led to the repeated evolution of herbicide resistance in weeds. The evolution of herbicide resistance on contemporary timescales in turn provides an outstanding opportunity to investigate key questions about the genetics of adaptation, in particular the relative importance of adaptation from new mutations, standing genetic variation, or geographic spread of adaptive alleles through gene flow. Glyphosate-resistant Amaranthus tuberculatus poses one of the most significant threats to crop yields in the Midwestern United States, with both agricultural populations and herbicide resistance only recently emerging in Canada. To understand the evolutionary mechanisms driving the spread of resistance, we sequenced and assembled the A. tuberculatus genome and investigated the origins and population genomics of 163 resequenced glyphosate-resistant and susceptible individuals from Canada and the United States. In Canada, we discovered multiple modes of convergent evolution: in one locality, resistance appears to have evolved through introductions of preadapted US genotypes, while in another, there is evidence for the independent evolution of resistance on genomic backgrounds that are historically nonagricultural. Moreover, resistance on these local, nonagricultural backgrounds appears to have occurred predominantly through the partial sweep of a single haplotype. In contrast, resistant haplotypes arising from the Midwestern United States show multiple amplification haplotypes segregating both between and within populations. Therefore, while the remarkable species-wide diversity of A. tuberculatus has facilitated geographic parallel adaptation of glyphosate resistance, more recently established agricultural populations are limited to adaptation in a more mutation-limited framework. Copyright © 2019 the Author(s). Published by PNAS.

April 21, 2020

The major histocompatibility complex of Old World camelids: Class I and class I-related genes.

The genomic structure of the Major Histocompatibility Complex (MHC) region and variation in selected MHC class I related genes in Old World camels, Camelus bactrianus and Camelus dromedarius, were studied. The overall genomic organization of the camel MHC region follows a general pattern observed in other mammalian species and individual MHC loci appear to be well conserved. Selected MHC class I genes B-67 and BL3-7 exhibited unexpectedly low variability, even when compared to other camel MHC class I related genes MR1 and MICA. Interspecific SNP and allele sharing are relatively common, and frequencies of heterozygotes are usually low. Such a low variation in a genomic region generally considered as one of the most polymorphic in vertebrate genomes is unusual. Evolutionary relationships between MHC class I related genes and their counterparts from other species seem to be rather complex. Often, they do not follow the general evolutionary history of the species concerned. Close evolutionary relationships of individual MHC class I loci between camels, humans and dogs were observed. Based on the results of this study and on our data on MHC class II genes, the extent and the pattern of polymorphism of the MHC region of Old World camelids differed from most mammalian groups studied so far. Camels thus seem to be an important model for our understanding of the role of genetic diversity in immune functions, especially in the context of unique features of their immunoglobulin and T-cell receptor genes. © 2019 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

April 21, 2020

Mate Selection in Self-Compatible Wild Tobacco Results from Coordinated Variation in Homologous Self-Incompatibility Genes.

In flowering plants, intraspecific mate preference is frequently related to mating systems: the rejection of self pollen in self-incompatible (SI) plants that prevents inbreeding is one of the best described examples. However, in other mating systems, more nuanced patterns of pollen rejection occur. In the self-compatible (SC) Nicotiana attenuata, in which SI is not found and all crosses are compatible, certain pollen genotypes are consistently selected in mixed pollinations. However, the molecular mechanisms of this polyandrous mate selection remain unknown. Style-expressed NaS-like-RNases and pollen-expressed NaSLF-like genes, homologous to SI factors in Solanaceae, were identified and examined for a role in N. attenuata’s mate selection. A comparison of two NaS-like-RNases and six NaSLF-like genes among 26 natural accessions revealed specific combinations of co-expression and direct protein-protein interactions. To evaluate their role in mate selection, we silenced the expression of specific NaS-like-RNases and NaSLF-like proteins and conducted diagnostic binary mixed pollinations and mixed pollinations with 14 different non-self pollen donors. Styles expressing particular combinations of NaS-like-RNases selected mates from plants with corresponding NaS-like-RNase expression patterns, while styles lacking NaS-like-RNase expression were non-selective in their fertilizations, which reflected the genotype ratios of pollen mixtures deposited on the stigmas. DNA methylation could account for some of the observed variation in stylar NaS-like-RNase patterns. We conclude that the S-RNase-SLF recognition mechanism plays a central role in polyandrous mate selection in this self-compatible species. These results suggest that after the SI-SC transition, natural variation of SI homologous genes was repurposed to mediate intraspecific mate selection. Copyright © 2019 Elsevier Ltd. All rights reserved.

April 21, 2020

The complete mitochondrial genome of the tartar Sand Boa Eryx tataricus

Eryx is a genus of snakes belonging to the family Boidae. In this study, the mitochondrial genome sequence of Eryx tataricus was generated using a PacBio RSII DNA sequencer employing single-molecule, real-time sequencing technology. A maximum-likelihood (ML) phylogenetic tree of 26 snakes was reconstructed based on the 13 protein-coding genes to validate the mitochondrial DNA sequence.

April 21, 2020

The Genome Sequence of the Anthelmintic-Susceptible New Zealand Haemonchus contortus.

Internal parasitic nematodes are a global animal health issue causing drastic losses in livestock. Here, we report a H. contortus representative draft genome to serve as a genetic resource to the scientific community and support future experimental research of molecular mechanisms in related parasites. A de novo hybrid assembly was generated from PCR-free whole genome sequence data, resulting in a chromosome-level assembly that is 465 Mb in size encoding 22,341 genes. The genome sequence presented here is consistent with the genome architecture of the existing Haemonchus species and is a valuable resource for future studies regarding population genetic structures of parasitic nematodes. Additionally, comparative pan-genomics with other species of economically important parasitic nematodes have revealed highly open genomes and strong collinearities within the phylum Nematoda. © The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

April 21, 2020

Structural and functional characterization of an intradiol ring-cleavage dioxygenase from the polyphagous spider mite herbivore Tetranychus urticae Koch.

Genome analyses of the polyphagous spider mite herbivore Tetranychus urticae (two-spotted spider mite) revealed the presence of a set of 17 genes that code for secreted proteins belonging to the “intradiol dioxygenase-like” subgroup. Phylogenetic analyses indicate that this novel enzyme family has been acquired by horizontal gene transfer. In order to better understand the role of these proteins in T. urticae, we have structurally and functionally characterized one paralog (tetur07g02040). It was demonstrated that this protein is indeed an intradiol ring-cleavage dioxygenase, as the enzyme is able to cleave catechol between two hydroxyl-groups using atmospheric dioxygen. The enzyme was characterized functionally and structurally. The active site of the T. urticae enzyme contains an Fe3+ cofactor that is coordinated by two histidine and two tyrosine residues, an arrangement that is similar to those observed in bacterial homologs. However, the active site is significantly more solvent exposed than in bacterial proteins. Moreover, the mite enzyme is monomeric, while almost all structurally characterized bacterial homologs form oligomeric assemblies. Tetur07g02040 is not only the first spider mite dioxygenase that has been characterized at the molecular level, but is also the first structurally characterized intradiol ring-cleavage dioxygenase originating from a eukaryote. Copyright © 2018 Elsevier Ltd. All rights reserved.

April 21, 2020

Detecting a long insertion variant in SAMD12 by SMRT sequencing: implications of long-read whole-genome sequencing for repeat expansion diseases.

Long-read sequencing technology is now capable of reading single-molecule DNA with an average read length of more than 10 kb, fully enabling the coverage of large structural variations (SVs). This advantage may pave the way for the detection of unprecedented SVs as well as repeat expansions. Pathogenic SVs of only known genes used to be selectively analyzed based on prior knowledge of target DNA sequence. The unbiased application of long-read whole-genome sequencing (WGS) for the detection of pathogenic SVs has just begun. Here, we apply PacBio SMRT sequencing in a Japanese family with benign adult familial myoclonus epilepsy (BAFME). Our SV selection of low-coverage WGS data (7×) narrowed down the candidates to only six SVs in a 7.16-Mb region of the BAFME1 locus and correctly determined an approximately 4.6-kb SAMD12 intronic repeat insertion, which is causal of BAFME1. These results indicate that long-read WGS is potentially useful for evaluating all of the known SVs in a genome and identifying new disease-causing SVs in combination with other genetic methods to resolve the genetic causes of currently unexplained diseases.

