July 7, 2019

Structural variation analysis with strobe reads.

Structural variation, including deletions, duplications and rearrangements of DNA sequence, is an important contributor to genome variation in many organisms. In human, many structural variants are found in complex and highly repetitive regions of the genome, making their identification difficult. A new sequencing technology called strobe sequencing generates strobe reads containing multiple subreads from a single contiguous fragment of DNA. Strobe reads thus generalize the concept of paired reads, or mate pairs, that have been routinely used for structural variant detection. Strobe sequencing holds promise for unraveling complex variants that have been difficult to characterize with current sequencing technologies. We introduce an algorithm for identification of structural variants using strobe sequencing data. We consider strobe reads from a test genome that have multiple possible alignments to a reference genome due to sequencing errors and/or repetitive sequences in the reference. We formulate the combinatorial optimization problem of finding the minimum number of structural variants in the test genome that are consistent with these alignments. We solve this problem using an integer linear program. Using simulated strobe sequencing data, we show that our algorithm has better sensitivity and specificity than paired read approaches for structural variation identification.

Contact: braphael@brown.edu
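
The optimization objective described in this abstract can be illustrated with a toy sketch. The abstract states that the authors solve the problem with an integer linear program; the code below swaps in plain brute-force enumeration instead, which is only practical for tiny inputs. The data model (each candidate alignment represented as the set of variant labels it implies), the function name `min_variants`, and the read and variant names are all assumptions made for illustration, not the authors' formulation.

```python
from itertools import product


def min_variants(candidates):
    """Pick one alignment per read so that the fewest distinct variants are implied.

    candidates: dict mapping a read id to a list of candidate alignments,
    where each alignment is a frozenset of the variant labels it implies
    (hypothetical representation, assumed for this sketch).
    Returns (chosen alignment index per read, set of variants used).
    """
    reads = sorted(candidates)
    best_choice, best_svs = None, None
    # Enumerate every combination of one alignment per read (exponential cost).
    for choice in product(*(range(len(candidates[r])) for r in reads)):
        svs = set()
        for read, idx in zip(reads, choice):
            svs |= candidates[read][idx]
        if best_svs is None or len(svs) < len(best_svs):
            best_choice, best_svs = dict(zip(reads, choice)), svs
    return best_choice, best_svs


# Toy input: three reads, each with two ambiguous alignments.
example = {
    "read1": [frozenset({"del_A"}), frozenset({"inv_B"})],
    "read2": [frozenset({"del_A"}), frozenset({"dup_C"})],
    "read3": [frozenset({"inv_B", "dup_C"}), frozenset({"del_A"})],
}
choice, svs = min_variants(example)
print(choice, svs)
```

In this toy input a single deletion explains all three reads, so the most parsimonious selection uses one variant. An integer linear program expresses the same objective with binary selection variables and scales far better than enumeration.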


July 7, 2019

Computational solutions to large-scale data management and analysis.

Today we can generate hundreds of gigabases of DNA and RNA sequencing data in a week for less than US$5,000. The astonishing rate of data generation by these low-cost, high-throughput technologies in genomics is being matched by that of other technologies, such as real-time imaging and mass spectrometry-based flow cytometry. Success in the life sciences will depend on our ability to properly interpret the large-scale, high-dimensional data sets that are generated by these technologies, which in turn requires us to adopt advances in informatics. Here we discuss how we can master the different types of computational environments that exist – such as cloud and heterogeneous computing – to successfully tackle our big data problems.


July 7, 2019

Selective aluminum passivation for targeted immobilization of single DNA polymerase molecules in zero-mode waveguide nanostructures.

Optical nanostructures have enabled the creation of subdiffraction detection volumes for single-molecule fluorescence microscopy. Their applicability is extended by the ability to place molecules in the confined observation volume without interfering with their biological function. Here, we demonstrate that processive DNA synthesis thousands of bases in length was carried out by individual DNA polymerase molecules immobilized in the observation volumes of zero-mode waveguides (ZMWs) in high-density arrays. Selective immobilization of polymerase to the fused silica floor of the ZMW was achieved by passivation of the metal cladding surface using polyphosphonate chemistry, producing enzyme density contrasts of glass over aluminum in excess of 400:1. Yields of single-molecule occupancies of approximately 30% were obtained for a range of ZMW diameters (70-100 nm). Results presented here support the application of immobilized single DNA polymerases in ZMW arrays for long-read-length DNA sequencing.


July 7, 2019

Long, processive enzymatic DNA synthesis using 100% dye-labeled terminal phosphate-linked nucleotides.

We demonstrate the efficient synthesis of DNA with complete replacement of the four deoxyribonucleoside triphosphate (dNTP) substrates with nucleotides carrying fluorescent labels. A different, spectrally separable fluorescent dye suitable for single molecule fluorescence detection was conjugated to each of the four dNTPs via linkage to the terminal phosphate. Using these modified nucleotides, DNA synthesis by phi 29 DNA polymerase was observed to be processive for products thousands of bases in length, with labeled nucleotide affinities and DNA polymerization rates approaching unmodified dNTP levels. Results presented here show the compatibility of these nucleotides for single-molecule, real-time DNA sequencing applications.


July 7, 2019

Current status of genome sequencing and its applications in aquaculture

Aquaculture is the fastest-growing food production sector in agriculture, with great potential to meet projected protein needs of human beings. Aquaculture is facing several challenges, including lack of a sufficient number of genetically improved species, lack of species-specific feeds, high mortality due to diseases and pollution of ecosystems. The rapid development of sequencing technologies has revolutionized biological sciences, and supplied necessary tools to tackle these challenges in aquaculture and thus ensure its sustainability and profitability. So far, draft genomes have been published in over 24 aquaculture species, and used to address important issues related to aquaculture. We briefly review the advances of next generation sequencing technologies, and summarize the status of whole genome sequencing and its general applications (i.e. establishing reference genomes and discovering DNA markers) and specific applications in tackling some important issues (e.g. breeding, diseases, sex determination & maturation) related to aquaculture. For sequencing a new genome, we recommend the use of 100–200 × short reads using Illumina and 50–60 × long reads with PacBio sequencing technologies. For identification of a large number of SNPs, resequencing pooled DNA samples from different populations is the most cost-effective way. We also discuss the challenges and future directions of whole genome sequencing in aquaculture.
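
The coverage recommendation in this abstract translates into a sequencing yield target through the usual relation: bases required = genome size × coverage. The short sketch below works that arithmetic through; the 1 Gb genome size is a hypothetical value chosen for illustration, and the function name `bases_needed` is likewise an assumption.

```python
def bases_needed(genome_size_bp: int, coverage: float) -> float:
    """Total sequenced bases required to reach a given average coverage."""
    return genome_size_bp * coverage


# Hypothetical genome size (1 Gb); not taken from the abstract.
GENOME_SIZE_BP = 1_000_000_000

# Coverage ranges as recommended in the abstract.
recommendations = {
    "Illumina short reads": (100, 200),
    "PacBio long reads": (50, 60),
}

for platform, (low, high) in recommendations.items():
    low_gb = bases_needed(GENOME_SIZE_BP, low) / 1e9
    high_gb = bases_needed(GENOME_SIZE_BP, high) / 1e9
    print(f"{platform}: {low_gb:.0f} to {high_gb:.0f} Gb of sequence")
```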


July 7, 2019

A gapless genome sequence of the fungus Botrytis cinerea.

Following earlier incomplete and fragmented versions of a genome sequence for the grey mould Botrytis cinerea, we here report a gapless, near-finished genome sequence for B. cinerea strain B05.10. The assembly comprises 18 chromosomes and was confirmed by an optical map and a genetic map based on ~75 000 SNP markers. All chromosomes contain fully assembled centromeric regions, and 10 chromosomes have telomeres on both ends. The genetic map consisted of 4153 cM and comparison of genetic distances with the physical distances identified 40 recombination hotspots. The linkage map also identified two mutations, located in the previously described genes Bos1 and BcsdhB, that confer resistance to the fungicides boscalid and iprodione. The genome was predicted to encode 11 701 proteins. RNAseq data from >20 different samples were used to validate and improve gene models. Manual curation of chromosome 1 revealed interesting features, such as the occurrence of a dicistronic transcript and fully overlapping genes in opposite orientations, as well as many spliced antisense transcripts. Manual curation also revealed that UTRs of genes can be complex and long, with many UTRs exceeding lengths of 1 kb and possessing multiple introns. Community annotation is in progress. This article is protected by copyright. All rights reserved. © 2016 BSPP AND JOHN WILEY & SONS LTD.


July 7, 2019

The comparative landscape of duplications in Heliconius melpomene and Heliconius cydno.

Gene duplications can facilitate adaptation and may lead to interpopulation divergence, causing reproductive isolation. We used whole-genome resequencing data from 34 butterflies to detect duplications in two Heliconius species, Heliconius cydno and Heliconius melpomene. Taking advantage of three distinctive signals of duplication in short-read sequencing data, we identified 744 duplicated loci in H. cydno and H. melpomene and evaluated the accuracy of our approach using single-molecule sequencing. We have found that duplications overlap genes significantly less than expected at random in H. melpomene, consistent with the action of background selection against duplicates in functional regions of the genome. Duplicate loci that are highly differentiated between H. melpomene and H. cydno map to four different chromosomes. Four duplications were identified with a strong signal of divergent selection, including an odorant binding protein and another in close proximity with a known wing colour pattern locus that differs between the two species. Heredity advance online publication, 7 December 2016; doi:10.1038/hdy.2016.107.


July 7, 2019

Comparative mitogenomic analysis of three species of periwinkles: Littorina fabalis, L. obtusata and L. saxatilis.

The flat periwinkles, Littorina fabalis and L. obtusata, offer an interesting system for local adaptation and ecological speciation studies. In order to provide genomic resources for these species, we sequenced their mitogenomes together with that of the rough periwinkle L. saxatilis by means of next-generation sequencing technologies. The three mitogenomes present the typical repertoire of 13 protein-coding genes, 22 transfer RNA genes, two ribosomal RNA genes and a putative control region. Although the latter could not be fully recovered in flat periwinkles using short-reads due to a highly repetitive fragment, in L. saxatilis this problem was overcome with additional long-reads and we were able to assemble the complete mitogenome. Both gene order and nucleotide composition are similar between the three species as well as compared to other Littorinimorpha. A large variance in divergence was observed across mitochondrial regions, with six- to ten-fold difference between the highest and the lowest divergence rates. Based on nucleotide changes on the whole molecule and assuming a molecular clock, L. fabalis and L. obtusata started to diverge around 0.8 Mya (0.4-1.1 Mya). The evolution of the mitochondrial protein-coding genes in the three Littorina species appears mainly influenced by purifying selection as revealed by phylogenetic tests based on dN/dS ratios that did not detect any evidence for positive selection, although some caution is required given the limited power of the dataset and the implemented approaches. Copyright © 2016 Elsevier B.V. All rights reserved.


July 7, 2019

Structure and evolution of the filaggrin gene repeated region in primates

The evolutionary dynamics of repeat sequences is quite complex, with some duplicates never having differentiated from each other. Two models can explain the complex evolutionary process for repeated genes—concerted and birth-and-death, of which the latter is driven by duplications maintained by selection. Copy number variations caused by random duplications and losses in repeat regions may modulate molecular pathways and therefore affect phenotypic characteristics in a population, resulting in individuals that are able to adapt to new environments. In this study, we investigated the filaggrin gene (FLG), which codes for filaggrin—an important component of the outer layers of mammalian skin—and contains tandem repeats that exhibit copy number variation between and within species. To examine which model best fits the evolutionary pathway for the complete tandem repeats within a single exon of FLG, we determined the repeat sequences in crab-eating macaque (Macaca fascicularis), orangutan (Pongo abelii), gorilla (Gorilla gorilla), and chimpanzee (Pan troglodytes) and compared these with the sequence in human (Homo sapiens).


July 7, 2019

Identification of small RNAs in extracellular vesicles from the commensal yeast Malassezia sympodialis.

Malassezia is the dominant fungus in the human skin mycobiome and is associated with common skin disorders including atopic eczema (AE)/dermatitis. Recently, it was found that Malassezia sympodialis secretes nanosized exosome-like vesicles, designated MalaEx, that carry allergens and can induce inflammatory cytokine responses. Extracellular vesicles from different cell-types including fungi have been found to deliver functional RNAs to recipient cells. In this study we assessed the presence of small RNAs in MalaEx and addressed if the levels of these RNAs differ when M. sympodialis is cultured at normal human skin pH versus the elevated pH present on the skin of patients with AE. The total number and the protein concentration of the released MalaEx harvested after 48 h culture did not differ significantly between the two pH conditions nor did the size of the vesicles. From small RNA sequence data, we identified a set of reads with well-defined start and stop positions, in a length range of 16 to 22 nucleotides consistently present in the MalaEx. The levels of small RNAs were not significantly differentially expressed between the two different pH conditions indicating that they are not influenced by the elevated pH level observed on the AE skin.


July 7, 2019

Evolutionary genomics of the cold-adapted diatom Fragilariopsis cylindrus.

The Southern Ocean houses a diverse and productive community of organisms. Unicellular eukaryotic diatoms are the main primary producers in this environment, where photosynthesis is limited by low concentrations of dissolved iron and large seasonal fluctuations in light, temperature and the extent of sea ice. How diatoms have adapted to this extreme environment is largely unknown. Here we present insights into the genome evolution of a cold-adapted diatom from the Southern Ocean, Fragilariopsis cylindrus, based on a comparison with temperate diatoms. We find that approximately 24.7 per cent of the diploid F. cylindrus genome consists of genetic loci with alleles that are highly divergent (15.1 megabases of the total genome size of 61.1 megabases). These divergent alleles were differentially expressed across environmental conditions, including darkness, low iron, freezing, elevated temperature and increased CO2. Alleles with the largest ratio of non-synonymous to synonymous nucleotide substitutions also show the most pronounced condition-dependent expression, suggesting a correlation between diversifying selection and allelic differentiation. Divergent alleles may be involved in adaptation to environmental fluctuations in the Southern Ocean.


July 7, 2019

Plasmodium malariae and P. ovale genomes provide insights into malaria parasite evolution.

Elucidation of the evolutionary history and interrelatedness of Plasmodium species that infect humans has been hampered by a lack of genetic information for three human-infective species: P. malariae and two P. ovale species (P. o. curtisi and P. o. wallikeri). These species are prevalent across most regions in which malaria is endemic and are often undetectable by light microscopy, rendering their study in human populations difficult. The exact evolutionary relationship of these species to the other human-infective species has been contested. Using a new reference genome for P. malariae and a manually curated draft P. o. curtisi genome, we are now able to accurately place these species within the Plasmodium phylogeny. Sequencing of a P. malariae relative that infects chimpanzees reveals similar signatures of selection in the P. malariae lineage to another Plasmodium lineage shown to be capable of colonization of both human and chimpanzee hosts. Molecular dating suggests that these host adaptations occurred over similar evolutionary timescales. In addition to the core genome that is conserved between species, differences in gene content can be linked to their specific biology. The genome suggests that P. malariae expresses a family of heterodimeric proteins on its surface that have structural similarities to a protein crucial for invasion of red blood cells. The data presented here provide insight into the evolution of the Plasmodium genus as a whole.


July 7, 2019

De novo hybrid assembly of the rubber tree genome reveals evidence of paleotetraploidy in Hevea species.

Para rubber tree (Hevea brasiliensis) is an important economic species as it is the sole commercial producer of high-quality natural rubber. Here, we report a de novo hybrid assembly of BPM24 accession, which exhibits resistance to major fungal pathogens in Southeast Asia. Deep-coverage 454/Illumina short-read and Pacific Biosciences (PacBio) long-read sequence data were acquired to generate a preliminary draft, which was subsequently scaffolded using a long-range “Chicago” technique to obtain a final assembly of 1.26 Gb (N50 = 96.8 kb). The assembled genome contains 69.2% repetitive sequences and has a GC content of 34.31%. Using a high-density SNP-based genetic map, we were able to anchor 28.9% of the genome assembly (363 Mb) associated with over two thirds of the predicted protein-coding genes into rubber tree’s 18 linkage groups. These genetically anchored sequences allowed comparative analyses of the intragenomic homeologous synteny, providing the first concrete evidence to demonstrate the presence of paleotetraploidy in Hevea species. Additionally, the degree of macrosynteny conservation observed between rubber tree and cassava strongly supports the hypothesis that the paleotetraploidization event took place prior to the divergence of the Hevea and Manihot species.
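
The N50 statistic quoted in this abstract is conventionally defined as the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A minimal sketch of that calculation follows; the function name `n50` and the toy scaffold lengths are invented for illustration and are not data from the rubber tree assembly.

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths (hypothetical values).
toy_scaffolds = [2, 3, 4, 5, 6]
print(n50(toy_scaffolds))
```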


July 7, 2019

The evolution and population diversity of human-specific segmental duplications

Segmental duplications contribute to human evolution, adaptation and genomic instability but are often poorly characterized. We investigate the evolution, genetic variation and coding potential of human-specific segmental duplications (HSDs). We identify 218 HSDs based on analysis of 322 deeply sequenced archaic and contemporary hominid genomes. We sequence 550 human and nonhuman primate genomic clones to reconstruct the evolution of the largest, most complex regions with protein-coding potential (N = 80 genes from 33 gene families). We show that HSDs are non-randomly organized, associate preferentially with ancestral ape duplications termed ‘core duplicons’ and evolved primarily in an interspersed inverted orientation. In addition to Homo sapiens-specific gene expansions (such as TCAF1/TCAF2), we highlight ten gene families (for example, ARHGAP11B and SRGAP2C) where copy number never returns to the ancestral state, there is evidence of mRNA splicing and no common gene-disruptive mutations are observed in the general population. Such duplicates are candidates for the evolution of human-specific adaptive traits.


July 7, 2019

First complete genome sequence of Marinilactibacillus piezotolerans strain 15R, a marine lactobacillus isolated from coal-bearing sediment 2.0 kilometers below the seafloor, determined by PacBio single-molecule real-time technology.

Marinilactibacillus piezotolerans strain 15R is a facultatively anaerobic heterotrophic lactobacillus isolated from deep marine subsurface sediment nearly 2 km below the seafloor in the northwestern Pacific. We report here the first whole-genome sequence of strain 15R. The identified genome sequence has 2,767,908 bp, 35.4% G+C content, and predicted 2,552 candidate protein-coding sequences, with no identified plasmids. Copyright © 2017 Wei et al.

