April 21, 2020

TSD: A Computational Tool To Study the Complex Structural Variants Using PacBio Targeted Sequencing Data.

PacBio sequencing is a powerful approach to study DNA or RNA sequences in a longer scope. It is especially useful in exploring the complex structural variants generated by random integration or multiple rearrangement of endogenous or exogenous sequences. Here, we present a tool, TSD, for complex structural variant discovery using PacBio targeted sequencing data. It allows researchers to identify and visualize the genomic structures of targeted sequences by unlimited splitting, alignment and assembly of long PacBio reads. Application to the sequencing data derived from an HBV integrated human cell line (PLC/PRF/5) indicated that TSD could recover the full profile of HBV integration events, especially for the regions with the complex human-HBV genome integrations and multiple HBV rearrangements. Compared to other long read analysis tools, TSD showed a better performance for detecting complex genomic structural variants. TSD is publicly available at: https://github.com/menggf/tsd. Copyright © 2019 Meng et al.


April 21, 2020

Harnessing genomic information for livestock improvement.

The world demand for animal-based food products is anticipated to increase by 70% by 2050. Meeting this demand in a way that has a minimal impact on the environment will require the implementation of advanced technologies, and methods to improve the genetic quality of livestock are expected to play a large part. Over the past 10 years, genomic selection has been introduced in several major livestock species and has more than doubled genetic progress in some. However, additional improvements are required. Genomic information of increasing complexity (including genomic, epigenomic, transcriptomic and microbiome data), combined with technological advances for its cost-effective collection and use, will make a major contribution.


April 21, 2020

Mobilization of Pack-CACTA transposons in Arabidopsis suggests the mechanism of gene shuffling.

Pack-TYPE transposons are a unique class of potentially mobile non-autonomous elements that can capture, merge and relocate fragments of chromosomal DNA. It has been postulated that their activity accelerates the evolution of host genes. However, this important presumption is based only on the sequences of currently inactive Pack-TYPE transposons and the acquisition of chromosomal DNA has not been recorded in real time. Analysing the DNA copy number variation in hypomethylated Arabidopsis lines, we have now for the first time witnessed the mobilization of novel Pack-TYPE elements related to the CACTA transposon family, over several plant generations. Remarkably, these elements can insert into genes as closely spaced direct repeats and they frequently undergo incomplete excisions, resulting in the deletion of one of the end sequences. These properties suggest a mechanism of efficient acquisition of genic DNA residing between neighbouring Pack-TYPE transposons and its subsequent mobilization. Our work documents crucial steps in the formation of in vivo novel Pack-TYPE transposons, and thus the possible mechanism of gene shuffling mediated by this type of mobile element. © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.


April 21, 2020

The role of genomic structural variation in the genetic improvement of polyploid crops

Many of our major crop species are polyploids, containing more than one genome or set of chromosomes. Polyploid crops present unique challenges, including difficulties in genome assembly, in discriminating between multiple gene and sequence copies, and in genetic mapping, hindering use of genomic data for genetics and breeding. Polyploid genomes may also be more prone to containing structural variation, such as loss of gene copies or sequences (presence–absence variation) and the presence of genes or sequences in multiple copies (copy-number variation). Although the two main types of genomic structural variation commonly identified are presence–absence variation and copy-number variation, we propose that homeologous exchanges constitute a third major form of genomic structural variation in polyploids. Homeologous exchanges involve the replacement of one genomic segment by a similar copy from another genome or ancestrally duplicated region, and are known to be extremely common in polyploids. Detecting all kinds of genomic structural variation is challenging, but recent advances such as optical mapping and long-read sequencing offer potential strategies to help identify structural variants even in complex polyploid genomes. All three major types of genomic structural variation (presence–absence, copy-number, and homeologous exchange) are now known to influence phenotypes in crop plants, with examples of flowering time, frost tolerance, and adaptive and agronomic traits. In this review, we summarize the challenges of genome analysis in polyploid crops, describe the various types of genomic structural variation and the genomics technologies and data that can be used to detect them, and collate information produced to date related to the impact of genomic structural variation on crop phenotypes. We highlight the importance of genomic structural variation for the future genetic improvement of polyploid crops.


April 21, 2020

A Species-Wide Inventory of NLR Genes and Alleles in Arabidopsis thaliana.

Infectious disease is both a major force of selection in nature and a prime cause of yield loss in agriculture. In plants, disease resistance is often conferred by nucleotide-binding leucine-rich repeat (NLR) proteins, intracellular immune receptors that recognize pathogen proteins and their effects on the host. Consistent with extensive balancing and positive selection, NLRs are encoded by one of the most variable gene families in plants, but the true extent of intraspecific NLR diversity has been unclear. Here, we define a nearly complete species-wide pan-NLRome in Arabidopsis thaliana based on sequence enrichment and long-read sequencing. The pan-NLRome largely saturates with approximately 40 well-chosen wild strains, with half of the pan-NLRome being present in most accessions. We chart NLR architectural diversity, identify new architectures, and quantify selective forces that act on specific NLRs and NLR domains. Our study provides a blueprint for defining pan-NLRomes. Copyright © 2019 The Author(s). Published by Elsevier Inc. All rights reserved.


April 21, 2020

Genome of the Komodo dragon reveals adaptations in the cardiovascular and chemosensory systems of monitor lizards.

Monitor lizards are unique among ectothermic reptiles in that they have high aerobic capacity and distinctive cardiovascular physiology resembling that of endothermic mammals. Here, we sequence the genome of the Komodo dragon Varanus komodoensis, the largest extant monitor lizard, and generate a high-resolution de novo chromosome-assigned genome assembly for V. komodoensis using a hybrid approach of long-range sequencing and single-molecule optical mapping. Comparing the genome of V. komodoensis with those of related species, we find evidence of positive selection in pathways related to energy metabolism, cardiovascular homoeostasis, and haemostasis. We also show species-specific expansions of a chemoreceptor gene family related to pheromone and kairomone sensing in V. komodoensis and other lizard lineages. Together, these evolutionary signatures of adaptation reveal the genetic underpinnings of the unique Komodo dragon sensory and cardiovascular systems, and suggest that selective pressure altered haemostasis genes to help Komodo dragons evade the anticoagulant effects of their own saliva. The Komodo dragon genome is an important resource for understanding the biology of monitor lizards and reptiles worldwide.


April 21, 2020

A survey and evaluations of histogram-based statistics in alignment-free sequence comparison.

Since the dawn of the bioinformatics field, sequence alignment scores have been the main method for comparing sequences. However, alignment algorithms are quadratic, requiring long execution time. As alternatives, scientists have developed tens of alignment-free statistics for measuring the similarity between two sequences. We surveyed tens of alignment-free k-mer statistics. Additionally, we evaluated 33 statistics and multiplicative combinations between the statistics and/or their squares. These statistics are calculated on two k-mer histograms representing two sequences. Our evaluations using global alignment scores revealed that the majority of the statistics are sensitive and capable of finding similar sequences to a query sequence. Therefore, any of these statistics can filter out dissimilar sequences quickly. Further, we observed that multiplicative combinations of the statistics are highly correlated with the identity score. Furthermore, combinations involving sequence length difference or Earth Mover’s distance, which takes the length difference into account, are always among the highest correlated paired statistics with identity scores. Similarly, paired statistics including length difference or Earth Mover’s distance are among the best performers in finding the K-closest sequences. Interestingly, similar performance can be obtained using histograms of shorter words, resulting in reducing the memory requirement and increasing the speed remarkably. Moreover, we found that simple single statistics are sufficient for processing next-generation sequencing reads and for applications relying on local alignment. Finally, we measured the time requirement of each statistic. The survey and the evaluations will help scientists with identifying efficient alternatives to the costly alignment algorithm, saving thousands of computational hours. The source code of the benchmarking tool is available as Supplementary Materials. © The Author 2017. Published by Oxford University Press.
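
As a rough illustration of what "statistics calculated on two k-mer histograms" can look like, the sketch below builds a histogram of overlapping k-mers for each sequence and computes two simple quantities: the Manhattan distance between the histograms and the sequence length difference. The function names and the choice of Manhattan distance are assumptions made for this example; they are not taken from the paper's benchmarking tool, which evaluates 33 statistics.

```python
from collections import Counter


def kmer_histogram(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer (word of length k) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def manhattan_distance(h1: Counter, h2: Counter) -> int:
    """Sum of absolute count differences over all k-mers in either histogram.

    Chosen here only as one simple example of a histogram-based statistic.
    """
    # Counter returns 0 for a missing k-mer, so no special-casing is needed.
    return sum(abs(h1[word] - h2[word]) for word in h1.keys() | h2.keys())


def length_difference(a: str, b: str) -> int:
    """Absolute difference in sequence length."""
    return abs(len(a) - len(b))


if __name__ == "__main__":
    query, candidate = "ACGTACGT", "ACGTACGA"
    hq, hc = kmer_histogram(query, 3), kmer_histogram(candidate, 3)
    print(manhattan_distance(hq, hc), length_difference(query, candidate))
```

Both quantities are computed without any alignment, so the cost grows linearly with sequence length rather than quadratically.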


April 21, 2020

Evolution and Diversification of Kiwifruit Mitogenomes through Extensive Whole-Genome Rearrangement and Mosaic Loss of Intergenic Sequences in a Highly Variable Region.

Angiosperm mitochondrial genomes (mitogenomes) are notable for their extreme diversity in both size and structure. However, our current understanding of this diversity is limited, and the underlying mechanism contributing to this diversity remains unclear. Here, we completely assembled and compared the mitogenomes of three kiwifruit (Actinidia) species, which represent an early divergent lineage in asterids. We found conserved gene content and fewer genomic repeats, particularly large repeats (>1 kb), in the three mitogenomes. However, sequence transfers such as intracellular events are variable and dynamic, in which both ancestral shared and recently species-specific events as well as complicated transfers of two plastid-derived sequences into the nucleus through the mitogenomic bridge were detected. We identified extensive whole-genome rearrangements among kiwifruit mitogenomes and found a highly variable V region in which fragmentation and frequent mosaic loss of intergenic sequences occurred, resulting in greatly interspecific variations. One example is the fragmentation of the V region into two regions, V1 and V2, giving rise to the two mitochondrial chromosomes of Actinidia chinensis. Finally, we compared the kiwifruit mitogenomes with those of other asterids to characterize their overall mitogenomic diversity, which identified frequent gain/loss of genes/introns across lineages. In addition to repeat-mediated recombination and import-driven hypothesis of genome size expansion reported in previous studies, our results highlight a pattern of dynamic structural variation in plant mitogenomes through global genomic rearrangements and species-specific fragmentation and mosaic loss of intergenic sequences in highly variable regions on the basis of a relatively large ancestral mitogenome. © The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


April 21, 2020

Massive Changes of Genome Size Driven by Expansions of Non-autonomous Transposable Elements.

In eukaryotes, genome size correlates little with the number of coding genes or the level of organismal complexity (C-value paradox). The underlying causes of variations in genome size, whether adaptive or neutral, remain unclear, although several biological traits often covary with it [1-5]. Rapid increases in genome size occur mainly through whole-genome duplications or bursts in the activity of transposable elements (TEs) [6]. The very small and compact genome of Oikopleura dioica, a tunicate of the larvacean class, lacks elements of most ancient families of animal retrotransposons [7, 8]. Here, we sequenced the genomes of six other larvaceans, all of which are larger than that of Oikopleura (up to 12 times) and which increase in size with greater body length. Although no evidence was found for whole-genome duplications within the group of species, the global amount of TEs strongly correlated with genome size. Compared to other metazoans, however, the TE diversity was reduced in all species, as observed previously in O. dioica, suggesting a common ancestor with a compacted genome. Strikingly, non-autonomous elements, particularly short interspersed nuclear elements (SINEs), massively contributed to genome size variation through species-specific independent amplifications, ranging from 3% in the smallest genome up to 49% in the largest. Variations in SINE abundance explain as much as 83% of interspecific genome size variation. These data support an indirect influence of autonomous TEs on genome size via their ability to mobilize non-autonomous elements. Copyright © 2019 Elsevier Ltd. All rights reserved.


April 21, 2020

Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome.

The DNA sequencing technologies in use today produce either highly accurate short reads or less-accurate long reads. We report the optimization of circular consensus sequencing (CCS) to improve the accuracy of single-molecule real-time (SMRT) sequencing (PacBio) and generate highly accurate (99.8%) long high-fidelity (HiFi) reads with an average length of 13.5 kilobases (kb). We applied our approach to sequence the well-characterized human HG002/NA24385 genome and obtained precision and recall rates of at least 99.91% for single-nucleotide variants (SNVs), 95.98% for insertions and deletions <50 bp (indels) and 95.99% for structural variants. Our CCS method matches or exceeds the ability of short-read sequencing to detect small variants and structural variants. We estimate that 2,434 discordances are correctable mistakes in the 'genome in a bottle' (GIAB) benchmark set. Nearly all (99.64%) variants can be phased into haplotypes, further improving variant detection. De novo genome assembly using CCS reads alone produced a contiguous and accurate genome with a contig N50 of >15 megabases (Mb) and concordance of 99.997%, substantially outperforming assembly with less-accurate long reads.
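
For readers unfamiliar with the contig N50 quoted above, the sketch below shows the standard definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy contig lengths are illustrative assumptions and are unrelated to the HG002 assembly.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first covers half of the assembly.
    """
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("n50 requires at least one contig")


if __name__ == "__main__":
    # Toy lengths for illustration only.
    print(n50([2, 3, 4, 5, 6]))
```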


April 21, 2020

Slow Delivery Immunization Enhances HIV Neutralizing Antibody and Germinal Center Responses via Modulation of Immunodominance.

Conventional immunization strategies will likely be insufficient for the development of a broadly neutralizing antibody (bnAb) vaccine for HIV or other difficult pathogens because of the immunological hurdles posed, including B cell immunodominance and germinal center (GC) quantity and quality. We found that two independent methods of slow delivery immunization of rhesus monkeys (RMs) resulted in more robust T follicular helper (TFH) cell responses and GC B cells with improved Env-binding, tracked by longitudinal fine needle aspirates. Improved GCs correlated with the development of >20-fold higher titers of autologous nAbs. Using a new RM genomic immunoglobulin locus reference, we identified differential IgV gene use between immunization modalities. Ab mapping demonstrated targeting of immunodominant non-neutralizing epitopes by conventional bolus-immunized animals, whereas slow delivery-immunized animals targeted a more diverse set of epitopes. Thus, alternative immunization strategies can enhance nAb development by altering GCs and modulating the immunodominance of non-neutralizing epitopes. Copyright © 2019 Elsevier Inc. All rights reserved.


April 21, 2020

Copy-number variants in clinical genome sequencing: deployment and interpretation for rare and undiagnosed disease.

Current diagnostic testing for genetic disorders involves serial use of specialized assays spanning multiple technologies. In principle, genome sequencing (GS) can detect all genomic pathogenic variant types on a single platform. Here we evaluate copy-number variant (CNV) calling as part of a clinically accredited GS test. We performed analytical validation of CNV calling on 17 reference samples, compared the sensitivity of GS-based variants with those from a clinical microarray, and set a bound on precision using orthogonal technologies. We developed a protocol for family-based analysis of GS-based CNV calls, and deployed this across a clinical cohort of 79 rare and undiagnosed cases. We found that CNV calls from GS are at least as sensitive as those from microarrays, while only creating a modest increase in the number of variants interpreted (~10 CNVs per case). We identified clinically significant CNVs in 15% of the first 79 cases analyzed, all of which were confirmed by an orthogonal approach. The pipeline also enabled discovery of a uniparental disomy (UPD) and a 50% mosaic trisomy 14. Directed analysis of select CNVs enabled breakpoint level resolution of genomic rearrangements and phasing of de novo CNVs. Robust identification of CNVs by GS is possible within a clinical testing environment.


April 21, 2020

Characterization and phylogenetic analysis of the complete chloroplast genome sequence of Costus viridis (Costaceae)

The first complete chloroplast genome of Costus viridis (Costaceae) was reported in the current study. The C. viridis genome was 168,966 bp in length and comprised a pair of inverted repeat (IR) regions of 29,166 bp each, a large single-copy (LSC) region of 92,189 bp, and a small single-copy (SSC) region of 18,445 bp. It encoded 133 genes, including 87 protein-coding genes (79 PCG species), 38 tRNA genes (28 tRNA species), and eight rRNA genes (four rRNA species). The overall AT content was 63.75%. Phylogenetic analysis showed that C. viridis was closely related to species Costus osae within the genus Costus in family Costaceae.
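
The region sizes reported above can be checked against the total length with simple arithmetic: a quadripartite chloroplast genome consists of one LSC, one SSC and two IR copies. The short sketch below performs that check; the function name is a hypothetical helper introduced only for this illustration.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total genome length: one LSC, one SSC and two inverted repeat copies."""
    return lsc + ssc + 2 * ir


if __name__ == "__main__":
    # Region sizes (bp) as reported in the abstract.
    total = quadripartite_length(lsc=92_189, ssc=18_445, ir=29_166)
    print(total)  # compare with the reported 168,966 bp
```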


April 21, 2020

Substantial Heritable Variation in Recombination Rate on Multiple Scales in Honeybees and Bumblebees.

Meiotic recombination shuffles genetic variation and promotes correct segregation of chromosomes. Rates of recombination vary on several scales, both within genomes and between individuals, and this variation is affected by both genetic and environmental factors. Social insects have extremely high rates of recombination, although the evolutionary causes of this are not known. Here, we estimate rates of crossovers and gene conversions in 22 colonies of the honeybee, Apis mellifera, and 9 colonies of the bumblebee, Bombus terrestris, using direct sequencing of 299 haploid drone offspring. We confirm that both species have extremely elevated crossover rates, with higher rates measured in the highly eusocial honeybee than the primitively social bumblebee. There are also significant differences in recombination rate between subspecies of honeybee. There is substantial variation in genome-wide recombination rate between individuals of both A. mellifera and B. terrestris and the distribution of these rates overlap between species. A large proportion of interindividual variation in recombination rate is heritable, which indicates the presence of variation in trans-acting factors that influence recombination genome-wide. We infer that levels of crossover interference are significantly lower in honeybees compared to bumblebees, which may be one mechanism that contributes to higher recombination rates in honeybees. We also find a significant increase in recombination rate with distance from the centromere, mirrored by methylation differences. We detect a strong transmission bias due to GC-biased gene conversion associated with noncrossover gene conversions. Our results shed light on the mechanistic causes of extreme rates of recombination in social insects and the genetic architecture of recombination rate variation. Copyright © 2019 by the Genetics Society of America.


April 21, 2020

Mutation of a bHLH transcription factor allowed almond domestication.

Wild almond species accumulate the bitter and toxic cyanogenic diglucoside amygdalin. Almond domestication was enabled by the selection of genotypes harboring sweet kernels. We report the completion of the almond reference genome. Map-based cloning using an F1 population segregating for kernel taste led to the identification of a 46-kilobase gene cluster encoding five basic helix-loop-helix transcription factors, bHLH1 to bHLH5. Functional characterization demonstrated that bHLH2 controls transcription of the P450 monooxygenase-encoding genes PdCYP79D16 and PdCYP71AN24, which are involved in the amygdalin biosynthetic pathway. A nonsynonymous point mutation (Leu to Phe) in the dimerization domain of bHLH2 prevents transcription of the two cytochrome P450 genes, resulting in the sweet kernel trait. Copyright © 2019 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.

