July 19, 2019

Radical remodeling of the Y chromosome in a recent radiation of malaria mosquitoes.

Y chromosomes control essential male functions in many species, including sex determination and fertility. However, because of obstacles posed by repeat-rich heterochromatin, knowledge of Y chromosome sequences is limited to a handful of model organisms, constraining our understanding of Y biology across the tree of life. Here, we leverage long single-molecule sequencing to determine the content and structure of the nonrecombining Y chromosome of the primary African malaria mosquito, Anopheles gambiae. We find that the An. gambiae Y consists almost entirely of a few massively amplified, tandemly arrayed repeats, some of which can recombine with similar repeats on the X chromosome. Sex-specific genome resequencing in a recent species radiation, the An. gambiae complex, revealed rapid sequence turnover within An. gambiae and among species. Exploiting 52 sex-specific An. gambiae RNA-Seq datasets representing all developmental stages, we identified a small repertoire of Y-linked genes that lack X gametologs and are not Y-linked in any other species except An. gambiae, with the notable exception of YG2, a candidate male-determining gene. YG2 is the only gene conserved and exclusive to the Y in all species examined, yet sequence similarity to YG2 is not detectable in the genome of a more distant mosquito relative, suggesting rapid evolution of Y chromosome genes in this highly dynamic genus of malaria vectors. The extensive characterization of the An. gambiae Y provides a long-awaited foundation for studying male mosquito biology, and will inform novel mosquito control strategies based on the manipulation of Y chromosomes.


July 19, 2019

PacBio SMRT assembly of a complex multi-replicon genome reveals chlorocatechol degradative operon in a region of genome plasticity.

We have sequenced a Burkholderia genome that contains multiple replicons and large repetitive elements that would make it inherently difficult to assemble by short read sequencing technologies. We illustrate how the integrated long read correction algorithms implemented through the PacBio Single Molecule Real-Time (SMRT) sequencing technology successfully provided a de novo assembly that is a reasonable estimate of both the gene content and genome organization without making any further modifications. This assembly is comparable to related organisms assembled by more labour intensive methods. Our assembled genome revealed regions of genome plasticity for further investigation, one of which harbours a chlorocatechol degradative operon highly homologous to those previously identified on globally ubiquitous plasmids. In an ideal world, this assembly would still require experimental validation to confirm gene order and copy number of repeated elements. However, we submit that particularly in instances where a polished genome is not the primary goal of the sequencing project, PacBio SMRT sequencing provides a financially viable option for generating a biologically relevant genome estimate that can be utilized by other researchers for comparative studies. Copyright © 2016. Published by Elsevier B.V.


July 19, 2019

Mitotic intragenic recombination: A mechanism of survival for several congenital disorders of glycosylation.

Congenital disorders of glycosylation (CDGs) are disorders of abnormal protein glycosylation that affect multiple organ systems. Because most CDGs have been described in only a few individuals, our understanding of the associated phenotypes and the mechanisms of individual survival are limited. In the process of studying two siblings, aged 6 and 11 years, with MOGS-CDG and biallelic MOGS (mannosyl-oligosaccharide glucosidase) mutations (GenBank: NM_006302.2; c.[65C>A; 329G>A] p.[Ala22Glu; Arg110His]; c.[370C>T] p.[Gln124(*)]), we noted that their survival was much longer than the previous report of MOGS-CDG, in a child who died at 74 days of age. Upon mutation analysis, we detected multiple MOGS genotypes including wild-type alleles in their cultured fibroblast and peripheral blood DNA. Further analysis of DNA from cultured fibroblasts of six individuals with compound heterozygous mutations of PMM2 (PMM2-CDG), MPI (MPI-CDG), ALG3 (ALG3-CDG), ALG12 (ALG12-CDG), DPAGT1 (DPAGT1-CDG), and ALG1 (ALG1-CDG) also identified multiple genotypes including wild-type alleles for each. Droplet digital PCR showed a ratio of nearly 1:1 wild-type to mutant alleles for most, but not all, mutations. This suggests that mitotic recombination contributes to the survival and the variable expressivity of individuals with compound heterozygous CDGs. This also provides an explanation for prior observations of a reduced frequency of homozygous mutations and might contribute to increased levels of residual enzyme activity in cultured fibroblasts of individuals with MPI- and PMM2-CDGs. Copyright © 2016 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.


July 19, 2019

Chromosomal-level assembly of the Asian seabass genome using long sequence reads and multi-layered scaffolding.

We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping along with shared synteny from closely related fish species to derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50 size over 25 Mb that span ~90% of the genome. The population structure of L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species’ native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics.
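The contig and scaffold N50 values quoted above follow the usual definition: the length L such that sequences of length L or longer together cover at least half of the total assembly. A minimal sketch of that calculation, using made-up contig lengths rather than data from the Asian seabass assembly (the function name `n50` and the example list are illustrative assumptions):

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the sequence at which the running total,
    taken from longest to shortest, first reaches half of the
    summed length of all sequences.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (arbitrary units), not real assembly data.
example_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(example_contigs))
```

Because N50 is weighted by length, a few very long sequences can raise it sharply, which is why abstracts often report it alongside the fraction of the genome the assembly spans.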


July 19, 2019

Bats may eat diurnal flies that rest on wind turbines

Bats are currently killed in large numbers at wind turbines worldwide, but the ultimate reason why this happens remains poorly understood. One hypothesis is that bats visit wind turbines to feed on insects exposed at the turbine towers. We used single molecule next generation DNA sequencing to identify stomach contents of 18 bats of four species (Pipistrellus pygmaeus, Nyctalus noctula, Eptesicus nilssonii and Vespertilio murinus) found dead under wind turbines in southern Sweden. Stomach contents were diverse but included typically diurnal flies, e.g. blow-flies (Calliphoridae), flesh-flies (Sarcophagidae) and houseflies (Muscidae) and also several flightless taxa. Such prey items were eaten by all bat species and at all wind turbine localities and it seems possible that they had been captured at or near the surface of the turbines at night. Using sticky traps, we documented an abundance of swarming (diurnal) ants (Myrmica spp.) and sometimes blow-flies and houseflies at the nacelle house. Near the base of the tower the catches were more diverse and corresponded better with the taxa found in the bat stomachs, including various diurnal flies. To evaluate if flies and other insects resting on the surface of a wind turbine are available to bats, we ensonified a house fly (Musca) on a smooth (plastic) surface with synthetic ultrasonic pulses of the frequencies used by the bat species that we had sampled. The experiment revealed potentially useful echoes, provided the attack angle was low and the frequency high (50–75 kHz). Hence resting flies and other arthropods can probably be detected by echolocating bats on the surface of a wind turbine. Our findings are consistent with published observations of the behavior of bats at wind turbines and may actually explain the function of some of these behaviors.


July 19, 2019

Next generation sequencing of Actinobacteria for the discovery of novel natural products.

Like many fields of the biosciences, actinomycete natural products research has been revolutionised by next-generation DNA sequencing (NGS). Hundreds of new genome sequences from actinobacteria are made public every year, many of them as a result of projects aimed at identifying new natural products and their biosynthetic pathways through genome mining. Advances in these technologies in the last five years have meant not only a reduction in the cost of whole genome sequencing, but also a substantial increase in the quality of the data, having moved from obtaining a draft genome sequence comprised of several hundred short contigs, sometimes of doubtful reliability, to the possibility of obtaining an almost complete and accurate chromosome sequence in a single contig, allowing a detailed study of gene clusters and the design of strategies for refactoring and full gene cluster synthesis. The impact that these technologies are having in the discovery and study of natural products from actinobacteria, including those from the marine environment, is only starting to be realised. In this review we provide a historical perspective of the field, analyse the strengths and limitations of the most relevant technologies, and share the insights acquired during our genome mining projects.


July 19, 2019

Accelerated cloning of a potato late blight-resistance gene using RenSeq and SMRT sequencing.

Global yields of potato and tomato crops have fallen owing to potato late blight disease, which is caused by Phytophthora infestans. Although most commercial potato varieties are susceptible to blight, many wild potato relatives show variation for resistance and are therefore a potential source of Resistance to P. infestans (Rpi) genes. Resistance breeding has exploited Rpi genes from closely related tuber-bearing potato relatives, but is laborious and slow. Here we report that the wild, diploid non-tuber-bearing Solanum americanum harbors multiple Rpi genes. We combine resistance (R) gene sequence capture (RenSeq) with single-molecule real-time (SMRT) sequencing (SMRT RenSeq) to clone Rpi-amr3i. This technology should enable de novo assembly of complete nucleotide-binding, leucine-rich repeat receptor (NLR) genes, their regulatory elements and complex multi-NLR loci from uncharacterized germplasm. SMRT RenSeq can be applied to rapidly clone multiple R genes for engineering pathogen-resistant crops.


July 19, 2019

Continuous evolution of Bacillus thuringiensis toxins overcomes insect resistance.

The Bacillus thuringiensis δ-endotoxins (Bt toxins) are widely used insecticidal proteins in engineered crops that provide agricultural, economic, and environmental benefits. The development of insect resistance to Bt toxins endangers their long-term effectiveness. Here we have developed a phage-assisted continuous evolution selection that rapidly evolves high-affinity protein-protein interactions, and applied this system to evolve variants of the Bt toxin Cry1Ac that bind a cadherin-like receptor from the insect pest Trichoplusia ni (TnCAD) that is not natively bound by wild-type Cry1Ac. The resulting evolved Cry1Ac variants bind TnCAD with high affinity (dissociation constant Kd = 11-41 nM), kill TnCAD-expressing insect cells that are not susceptible to wild-type Cry1Ac, and kill Cry1Ac-resistant T. ni insects up to 335-fold more potently than wild-type Cry1Ac. Our findings establish that the evolution of Bt toxins with novel insect cell receptor affinity can overcome insect Bt toxin resistance and confer lethality approaching that of the wild-type Bt toxin against non-resistant insects.


July 19, 2019

SMRT RenSeq protocol

R gene enrichment and Sequencing (RenSeq, Jupe et al. 2013) is a genome complexity reduction method that enriches for nucleotide-binding, leucine-rich repeat (NLR) type plant disease resistance genes prior to sequencing. RenSeq was established and successfully used with Illumina platforms (Jupe et al. 2013, Andolfo et al. 2014); however, the repetitive nature of NLR genes hampered de novo assembly of this family. Here we describe a protocol for preparing long enriched libraries that are suitable for Pacific Biosciences Single-Molecule Real Time (SMRT) sequencing. Reads Of Inserts (ROI) generated with this protocol are around 3-4 kb in length (longer than the average NLR sequence). These long reads are especially well suited for de novo assembly of whole NLR genes, including their regulatory elements.


July 19, 2019

Integrating DNA methylation and gene expression data in the development of the soybean-Bradyrhizobium N2-fixing symbiosis.

Very little is known about the role of epigenetics in the differentiation of a bacterium from the free-living to the symbiotic state. Here genome-wide analysis of DNA methylation changes between these states is described using the model of symbiosis between soybean and its root nodule-forming, nitrogen-fixing symbiont, Bradyrhizobium diazoefficiens. PacBio resequencing of the B. diazoefficiens genome from both states revealed 43,061 sites recognized by five motifs with the potential to be methylated genome-wide. Of those sites, 3276 changed methylation states in 2921 genes or 35.5% of all genes in the genome. Over 10% of the methylation changes occurred within the symbiosis island that comprises 7.4% of the genome. The CCTTGAG motif was methylated only during symbiosis with 1361 adenosines methylated among the 1700 possible sites. Another 89 genes within the symbiotic island and 768 genes throughout the genome were found to have methylation and significant expression changes during symbiotic development. Of those, nine are known symbiosis genes involved in all phases of symbiotic development, including early infection events, nodule development, and nitrogenase production. These associations between methylation and expression changes in many B. diazoefficiens genes suggest an important role of the epigenome in bacterial differentiation to the symbiotic state.
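The proportions in this abstract can be checked with simple arithmetic. The sketch below uses only the figures quoted above. The variable names are illustrative, and the enrichment ratio rests on an assumption the abstract does not state, namely that candidate sites are spread evenly along the genome; it is also a lower bound, because the abstract says "over 10%".

```python
# Figures quoted in the abstract.
methylated_cctgag_sites = 1361   # adenosines methylated during symbiosis
possible_cctgag_sites = 1700     # possible CCTTGAG sites
genes_with_changes = 2921        # genes with changed methylation state
share_of_all_genes = 0.355       # stated as 35.5% of all genes
island_share_of_changes = 0.10   # "over 10%", so treated as a lower bound
island_share_of_genome = 0.074   # symbiosis island is 7.4% of the genome

# Fraction of CCTTGAG sites methylated during symbiosis.
motif_fraction = methylated_cctgag_sites / possible_cctgag_sites

# Total gene count implied by "2921 genes or 35.5% of all genes".
implied_total_genes = genes_with_changes / share_of_all_genes

# Lower bound on enrichment of methylation changes in the symbiosis
# island, ASSUMING sites are distributed uniformly across the genome.
island_enrichment = island_share_of_changes / island_share_of_genome

print(f"CCTTGAG sites methylated: {motif_fraction:.1%}")
print(f"Implied total genes: {implied_total_genes:.0f}")
print(f"Island enrichment (lower bound): {island_enrichment:.2f}x")
```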


July 19, 2019

Large deletions at the SHOX locus in the pseudoautosomal region are associated with skeletal atavism in Shetland ponies.

Skeletal atavism in Shetland ponies is a heritable disorder characterized by abnormal growth of the ulna and fibula that extend the carpal and tarsal joints, respectively. This causes abnormal skeletal structure, impaired movements, and affected foals are usually euthanized. In order to identify the causal mutation we subjected six confirmed Swedish cases and a DNA pool consisting of 21 control individuals to whole genome resequencing. We screened for polymorphisms where the cases and the control pool were fixed for opposite alleles and observed this signature for only 25 SNPs, most of which were scattered on genome assembly unassigned scaffolds. Read depth analysis at these loci revealed homozygosity or compound heterozygosity for two partially overlapping large deletions in the pseudoautosomal region (PAR) of chromosome X/Y in cases but not in the control pool. One of these deletions removes the entire coding region of the SHOX gene and both deletions remove parts of the CRLF2 gene located downstream of SHOX. The horse reference assembly of the PAR is highly fragmented, and in order to characterize this region we sequenced bacterial artificial chromosome (BAC) clones by single-molecule real-time (SMRT) sequencing technology. This considerably improved the assembly and enabled size estimations of the two deletions to 160-180 kb and 60-80 kb, respectively. Complete association between the presence of these deletions and disease status was verified in eight other affected horses. The result of the present study is consistent with previous studies in humans showing crucial importance of SHOX for normal skeletal development. Copyright © 2016 Author et al.


July 19, 2019

Genomic changes following the reversal of a Y chromosome to an autosome in Drosophila pseudoobscura

Robertsonian translocations resulting in fusions between sex chromosomes and autosomes shape karyotype evolution by creating new sex chromosomes from autosomes. These translocations can also reverse sex chromosomes back into autosomes, which is especially intriguing given the dramatic differences between autosomes and sex chromosomes. To study the genomic events following a Y chromosome reversal, we investigated an autosome-Y translocation in Drosophila pseudoobscura. The ancestral Y chromosome fused to a small autosome (the dot chromosome) approximately 10–15 Mya. We used single molecule real-time sequencing reads to assemble the D. pseudoobscura dot chromosome, including this Y-to-dot translocation. We find that the intervening sequence between the ancestral Y and the rest of the dot chromosome is only ~78 Kb and is not repeat-dense, suggesting that the centromere now falls outside, rather than between, the fused chromosomes. The Y-to-dot region is 100 times smaller than the D. melanogaster Y chromosome, owing to changes in repeat landscape. However, we do not find a consistent reduction in intron sizes across the Y-to-dot region. Instead, deletions in intergenic regions and possibly a small ancestral Y chromosome size may explain the compact size of the Y-to-dot translocation.


July 19, 2019

AgIn: Measuring the landscape of CpG methylation of individual repetitive elements.

Determining the methylation state of regions with high copy numbers is challenging for second-generation sequencing, because the read length is insufficient to map reads uniquely, especially when repetitive regions are long and nearly identical to each other. Single-molecule real-time (SMRT) sequencing is a promising method for observing such regions, because it is not vulnerable to GC bias, it produces long read lengths, and its kinetic information is sensitive to DNA modifications. We propose a novel linear-time algorithm that combines the kinetic information for neighboring CpG sites and increases the confidence in identifying the methylation states of those sites. Using a practical read coverage of ~30-fold from an inbred strain medaka (Oryzias latipes), we observed that both the sensitivity and precision of our method on individual CpG sites were ~93.7%. We also observed a high correlation coefficient (R = 0.884) between our method and bisulfite sequencing, and for 92.0% of CpG sites, methylation levels ranging over [0, 1] were in concordance within an acceptable difference 0.25. Using this method, we characterized the landscape of the methylation status of repetitive elements, such as LINEs, in the human genome, thereby revealing the strong correlation between CpG density and hypomethylation and detecting hypomethylation hot spots of LTRs and LINEs. We uncovered the methylation states for nearly identical active transposons, two novel LINE insertions of identity ~99% and length 6050 base pairs (bp) in the human genome, and 16 Tol2 elements of identity >99.8% and length 4682 bp in the medaka genome. AgIn (Aggregate on Intervals) is available at: https://github.com/hacone/AgIn CONTACT: ysuzuki@cb.k.u-tokyo.ac.jp, moris@cb.k.u-tokyo.ac.jp SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. © The Author(s) 2016. Published by Oxford University Press.


July 19, 2019

Analysis of tandem gene copies in maize chromosomal regions reconstructed from long sequence reads.

Haplotype variation not only involves SNPs but also insertions and deletions, in particular gene copy number variations. However, comparisons of individual genomes have been difficult because traditional sequencing methods give reads too short to unambiguously reconstruct chromosomal regions containing repetitive DNA sequences. An example of such a case is the protein gene family in maize that acts as a sink for reduced nitrogen in the seed. Previously, 41-48 gene copies of the alpha zein gene family that spread over six loci spanning between 30- and 500-kb chromosomal regions have been described in two Iowa Stiff Stalk (SS) inbreds. Analyses of those regions were possible because of overlapping BAC clones, generated by an expensive and labor-intensive approach. Here we used single-molecule real-time (Pacific Biosciences) shotgun sequencing to assemble the six chromosomal regions from the Non-Stiff Stalk maize inbred W22 from a single DNA sequence dataset. To validate the reconstructed regions, we developed an optical map (BioNano genome map; BioNano Genomics) of W22 and found agreement between the two datasets. Using the sequences of full-length cDNAs from W22, we found that the error rate of PacBio sequencing seemed to be less than 0.1% after autocorrection and assembly. Expressed genes, some with premature stop codons, are interspersed with nonexpressed genes, giving rise to genotype-specific expression differences. Alignment of these regions with the previously analyzed regions of SS lines reveals, in part, dramatic differences between these two heterotic groups.


July 19, 2019

Chromosome-level assembly of Arabidopsis thaliana Ler reveals the extent of translocation and inversion polymorphisms.

Resequencing or reference-based assemblies reveal large parts of the small-scale sequence variation. However, they typically fail to separate such local variation into colinear and rearranged variation, because they usually do not recover the complement of large-scale rearrangements, including transpositions and inversions. Besides the availability of hundreds of genomes of diverse Arabidopsis thaliana accessions, there is so far only one full-length assembled genome: the reference sequence. We have assembled 117 Mb of the A. thaliana Landsberg erecta (Ler) genome into five chromosome-equivalent sequences using a combination of short Illumina reads, long PacBio reads, and linkage information. Whole-genome comparison against the reference sequence revealed 564 transpositions and 47 inversions comprising ~3.6 Mb, in addition to 4.1 Mb of nonreference sequence, mostly originating from duplications. Although rearranged regions are not different in local divergence from colinear regions, they are drastically depleted for meiotic recombination in heterozygotes. Using a 1.2-Mb inversion as an example, we show that such rearrangement-mediated reduction of meiotic recombination can lead to genetically isolated haplotypes in the worldwide population of A. thaliana. Moreover, we found 105 single-copy genes, which were only present in the reference sequence or the Ler assembly, and 334 single-copy orthologs, which showed an additional copy in only one of the genomes. To our knowledge, this work gives first insights into the degree and type of variation that will be revealed once complete assemblies replace resequencing or other reference-dependent methods.

