July 7, 2019

Filling in the gap of human chromosome 4: Single Molecule Real Time sequencing of macrosatellite repeats in the facioscapulohumeral muscular dystrophy locus.

A majority of facioscapulohumeral muscular dystrophy (FSHD) is caused by contraction of macrosatellite repeats called D4Z4 that are located in the subtelomeric region of human chromosome 4q35. Sequencing the FSHD locus has been technically challenging due to its length and the nearly identical nature of its repeat elements. Here we report sequencing and partial assembly of a BAC clone carrying an entire FSHD locus by single-molecule real-time (SMRT) sequencing technology, which could produce long reads of up to about 18 kb containing D4Z4 repeats. De novo assembly by Hierarchical Genome Assembly Process 1 (HGAP.1) yielded a contig of 41 kb containing all but a part of the most distal D4Z4 element. The validity of the sequence model was confirmed by an independent approach employing anchored multiple sequence alignment by Kalign using reads containing unique flanking sequences. Our data will provide a basis for further optimization of sequencing and assembly conditions of D4Z4.


July 7, 2019

Single-locus enrichment without amplification for sequencing and direct detection of epigenetic modifications.

A gene-level targeted enrichment method for direct detection of epigenetic modifications is described. The approach is demonstrated on the CGG-repeat region of the FMR1 gene, for which large repeat expansions, hitherto refractory to sequencing, are known to cause fragile X syndrome. In addition to achieving a single-locus enrichment of nearly 700,000-fold, the elimination of all amplification steps removes PCR-induced bias in the repeat count and preserves the native epigenetic modifications of the DNA. In conjunction with the single-molecule real-time sequencing approach, this enrichment method enables direct readout of the methylation status and the CGG repeat number of the FMR1 allele(s) for a clonally derived cell line. The current method avoids potential biases introduced through chemical modification and/or amplification methods for indirect detection of CpG methylation events.


July 7, 2019

A time- and cost-effective strategy to sequence mammalian Y Chromosomes: an application to the de novo assembly of gorilla Y.

The mammalian Y Chromosome sequence, critical for studying male fertility and dispersal, is enriched in repeats and palindromes, and thus, is the most difficult component of the genome to assemble. Previously, expensive and labor-intensive BAC-based techniques were used to sequence the Y for a handful of mammalian species. Here, we present a much faster and more affordable strategy for sequencing and assembling mammalian Y Chromosomes of sufficient quality for most comparative genomics analyses and for conservation genetics applications. The strategy combines flow sorting, short- and long-read genome and transcriptome sequencing, and droplet digital PCR with novel and existing computational methods. It can be used to reconstruct sex chromosomes in a heterogametic sex of any species. We applied our strategy to produce a draft of the gorilla Y sequence. The resulting assembly allowed us to refine gene content, evaluate copy number of ampliconic gene families, locate species-specific palindromes, examine the repetitive element content, and produce sequence alignments with human and chimpanzee Y Chromosomes. Our results inform the evolution of the hominine (human, chimpanzee, and gorilla) Y Chromosomes. Surprisingly, we found the gorilla Y Chromosome to be similar to the human Y Chromosome, but not to the chimpanzee Y Chromosome. Moreover, we have utilized the assembled gorilla Y Chromosome sequence to design genetic markers for studying the male-specific dispersal of this endangered species. © 2016 Tomaszkiewicz et al.; Published by Cold Spring Harbor Laboratory Press.


July 7, 2019

Comparative genomic analyses of the Moraxella catarrhalis serosensitive and seroresistant lineages demonstrate their independent evolution.

The bacterial species Moraxella catarrhalis has been hypothesized as being composed of two distinct lineages (referred to as the seroresistant [SR] and serosensitive [SS]) with separate evolutionary histories based on several molecular typing methods, whereas 16S ribotyping has suggested an additional split within the SS lineage. Previously, we characterized whole-genome sequences of 12 SR-lineage isolates, which revealed a relatively small supragenome when compared with other opportunistic nasopharyngeal pathogens, suggestive of a relatively short evolutionary history. Here, we performed whole-genome sequencing on 18 strains from both ribotypes of the SS lineage, an additional SR strain, as well as four previously identified highly divergent strains based on multilocus sequence typing analyses. All 35 strains were subjected to a battery of comparative genomic analyses which clearly show that there are three lineages: the SR, SS, and the divergent. The SR and SS lineages are closely related, but distinct from each other based on three different methods of comparison: allelic differences observed among core genes; possession of lineage-specific sets of core and distributed genes; and an alignment of concatenated core sequences irrespective of gene annotation. All these methods show that the SS lineage has much longer interstrain branches than the SR lineage, indicating that this lineage has likely been evolving either longer or faster than the SR lineage. There is evidence of extensive horizontal gene transfer (HGT) within both of these lineages, and to a lesser degree between them. In particular, we identified very high rates of HGT between these two lineages for β-lactamase genes. The four divergent strains are sui generis, being much more distantly related to both the SR and SS groups than these other two groups are to each other. Based on average nucleotide identities, gene content, GC content, and genome size, this group could be considered as a separate taxonomic group. The SR and SS lineages, although distinct, clearly form a single species based on multiple criteria including a large common core genome, average nucleotide identity values, GC content, and genome size. Although neither of these lineages arose from within the other based on phylogenetic analyses, the question of how and when these lineages split and then subsequently reunited in the human nasopharynx is explored. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


July 7, 2019

Third-generation sequencing and the future of genomics

Third-generation long-range DNA sequencing and mapping technologies are creating a renaissance in high-quality genome sequencing. Unlike second-generation sequencing, which produces short reads a few hundred base-pairs long, third-generation single-molecule technologies generate over 10,000 bp reads or map over 100,000 bp molecules. We analyze how increased read lengths can be used to address long-standing problems in de novo genome assembly, structural variation analysis and haplotype phasing.


July 7, 2019

Complete genome sequence of antibiotic and anticancer agent violacein producing Massilia sp. strain NR 4-1.

Massilia sp. NR 4-1 is a violacein-producing strain newly isolated from topsoil under a nutmeg tree, Torreya nucifera, in the Korean national monument Bijarim Forest. Violacein is a novel class of drug derived from l-tryptophan that exhibits anticancer and antibiotic activities. Here, we present the complete genome of Massilia sp. strain NR 4-1, comprising 6,361,416 bp and a total of 5285 coding sequences (CDSs), including a complete violacein biosynthesis pathway, vioABCDE. The genome sequence of Massilia sp. NR 4-1 will provide a basis for stable and efficient biotechnological applications of violacein production. Copyright © 2016 Elsevier B.V. All rights reserved.


July 7, 2019

Single-molecule DNA hybridisation studied by using a modified DNA sequencer: a comparison with surface plasmon resonance data

Current methods for the determination of molecular interactions are widely used in the analytical sciences. To identify new methods, we investigated as a model system the hybridisation of a short 7 nt oligonucleotide, labelled with one of the structurally very similar cyanine dyes CY3 and DY-547, to a 34 nt oligonucleotide probe immobilised in a zero-mode waveguide (ZMW) nanostructure. Using a modified commercial off-the-shelf DNA sequencer, we established the principles to measure biomolecular interactions at the single-molecule level. Kinetic data were obtained from trains of fluorescence pulses, allowing the calculation of association and dissociation rate constants (k_on, k_off). For the 7mer labelled with the positively charged CY3 dye, k_on and k_off are ~3 times larger and ~2 times smaller, respectively, compared with the oligonucleotide labelled with the negatively charged DY-547 dye. The effect of neighbouring molecules lacking the 7 nt binding sequence on single-molecule rate constants is small. The association rate constant is reduced by only 20–35%. Hybrid dissociation is not affected since, as a consequence of the experimental design, rebinding cannot take place. Results of single-molecule experiments were compared with data obtained from surface plasmon resonance (SPR) performed under comparable conditions. A good correlation for the association rate constants, within a factor of 1.5, was found. Dissociation rate constants are smaller by a factor of 2–3, which we interpreted as a result of rebinding to neighbouring probes. Results of SPR measurements tend to systematically underestimate dissociation rate constants. The amount of this deviation depends on the association rate constant and the surface probe density. As a consequence, it is recommended to work at low probe densities to keep this effect small.


July 7, 2019

Refined Pichia pastoris reference genome sequence.

Strains of the species Komagataella phaffii are the most frequently used “Pichia pastoris” strains employed for recombinant protein production as well as studies on peroxisome biogenesis, autophagy and secretory pathway analyses. Genome sequencing of several different P. pastoris strains has provided the foundation for understanding these cellular functions in recent genomics, transcriptomics and proteomics experiments. This experimentation has identified mistakes, gaps and incorrectly annotated open reading frames in the previously published draft genome sequences. Here, a refined reference genome is presented, generated with genome and transcriptome sequencing data from multiple P. pastoris strains. Twelve major sequence gaps from 20 to 6000 base pairs were closed and 5111 out of 5256 putative open reading frames were manually curated and confirmed by RNA-seq and published LC-MS/MS data, including the addition of new open reading frames (ORFs) and a reduction in the number of spliced genes from 797 to 571. One chromosomal fragment of 76 kbp between two previous gaps on chromosome 1 and another 134 kbp fragment at the end of chromosome 4, as well as several shorter fragments, needed re-orientation. In total more than 500 positions in the genome have been corrected. This reference genome is presented with new chromosomal numbering, positioning ribosomal repeats at the distal ends of the four chromosomes, and includes predicted chromosomal centromeres as well as the sequence of two linear cytoplasmic plasmids of 13.1 and 9.5 kbp found in some strains of P. pastoris. Copyright © 2016. Published by Elsevier B.V.


July 7, 2019

Genome editing in human pluripotent stem cells: approaches, pitfalls, and solutions.

Human pluripotent stem cells (hPSCs) with knockout or mutant alleles can be generated using custom-engineered nucleases. Transcription activator-like effector nucleases (TALENs) and clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 nucleases are the most commonly employed technologies for editing hPSC genomes. In this Protocol Review, we provide a brief overview of custom-engineered nucleases in the context of gene editing in hPSCs with a focus on the application of TALENs and CRISPR/Cas9. We will highlight the advantages and disadvantages of each method and discuss theoretical and technical considerations for experimental design. Copyright © 2016 Elsevier Inc. All rights reserved.


July 7, 2019

Single-molecule sequencing assists genome assembly improvement and structural variation inference.

Dear editor, The single-molecule real-time (SMRT) sequencing platform presented by Pacific Biosciences (PacBio) is regarded as a third-generation sequencing technology (Eid et al., 2009, Roberts et al., 2013). PacBio delivers long reads from several to tens of kilobases (kbs), which are ideal for filling unsequenced gaps due to unusual sequence contexts, such as high-GC content or repeat-rich regions (Bashir et al., 2012, Berlin et al., 2015, Chaisson et al., 2015). PacBio long reads are also favorable for detecting large DNA fragments harboring structural variations (SVs), such as inversions, translocations, duplications, and large insertions/deletions (indels) (Ritz et al., 2010, English et al., 2014). However, one drawback of PacBio is the high error rate of base calling for single pass coverage of the genome (Au et al., 2012, Koren et al., 2012). This drawback can be mitigated by increasing sequencing coverage to achieve high consensus accuracy, but the requirements may be prohibitive for the de novo assembly of large- or medium-size genomes using only PacBio when considering both budgetary and computational costs. Alternatively, PacBio may be used for assembly improvement of near-finished reference genomes, especially for filling gaps in which unsequenced bases are represented by the letter N (English et al., 2012). Here, we combined PacBio (~15x) with Illumina reads (~40x) to improve the genome assemblies of African wild (Oryza barthii) and cultivated rice (O. glaberrima), and to infer large SVs between O. barthii and O. glaberrima.


July 7, 2019

Evolutionary redesign of the Atlantic cod (Gadus morhua L.) Toll-like receptor repertoire by gene losses and expansions.

Genome sequencing of the teleost Atlantic cod demonstrated loss of the Major Histocompatibility Complex (MHC) class II, an extreme gene expansion of MHC class I and gene expansions and losses in the innate pattern recognition receptor (PRR) family of Toll-like receptors (TLR). In a comparative genomic setting, using an improved version of the genome, we characterize PRRs in Atlantic cod with emphasis on TLRs demonstrating the loss of TLR1/6, TLR2 and TLR5 and expansion of TLR7, TLR8, TLR9, TLR22 and TLR25. We find that Atlantic cod TLR expansions are strongly influenced by diversifying selection likely to increase the detectable ligand repertoire through neo- and subfunctionalization. Using RNAseq we find that Atlantic cod TLRs display likely tissue or developmental stage-specific expression patterns. In a broader perspective, a comprehensive vertebrate TLR phylogeny reveals that the Atlantic cod TLR repertoire is extreme with regards to losses and expansions compared to other teleosts. In addition we identify a substantial shift in TLR repertoires following the evolutionary transition from an aquatic vertebrate (fish) to a terrestrial (tetrapod) life style. Collectively, our findings provide new insight into the function and evolution of TLRs in Atlantic cod as well as the evolutionary history of vertebrate innate immunity.


July 7, 2019

Evolution of coreceptor utilization to escape CCR5 antagonist therapy.

The HIV-1 envelope interacts with coreceptors CCR5 and CXCR4 in a dynamic, multi-step process, its molecular details not clearly delineated. Use of CCR5 antagonists results in tropism shift and therapeutic failure. Here we describe a novel approach using full-length patient-derived gp160 quasispecies libraries cloned into HIV-1 molecular clones, their separation based on phenotypic tropism in vitro, and deep sequencing of the resultant variants for structure-function analyses. Analysis of functionally validated envelope sequences from patients who failed CCR5 antagonist therapy revealed determinants strongly associated with coreceptor specificity, especially at the gp120-gp41 and gp41-gp41 interaction surfaces that invite future research on the roles of subunit interaction and envelope trimer stability in coreceptor usage. This study identifies important structure-function relationships in HIV-1 envelope, and demonstrates proof of concept for a new integrated analysis method that facilitates laboratory discovery of resistant mutants to aid in development of other therapeutic agents. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.


July 7, 2019

Understanding the genetics of APOE and TOMM40 and role of mitochondrial structure and function in clinical pharmacology of Alzheimer’s disease.

The methodology of Genome-Wide Association Screening (GWAS) has been applied for more than a decade. Translation to clinical utility has been limited, especially in Alzheimer’s Disease (AD). It has become standard practice in the analyses of more than two dozen AD GWAS studies to exclude the apolipoprotein E (APOE) region because of its extraordinary statistical support, unique thus far in complex human diseases. New genes associated with AD are proposed frequently based on SNPs associated with odds ratio (OR) < 1.2. Most of these SNPs are not located within the associated gene exons or introns but are located variable distances away. Often pathologic hypotheses for these genes are presented, with little or no experimental support. By eliminating the analyses of the APOE-TOMM40 linkage disequilibrium region, the relationship and data of several genes that are co-located in that LD region have been largely ignored. Early negative interpretations limited the interest of understanding the genetic data derived from GWAS, particularly regarding the TOMM40 gene. This commentary describes the history and problem(s) in interpretation of the genetic interrogation of the "APOE" region and provides insight into a metabolic mitochondrial basis for the etiology of AD using both APOE and TOMM40 genetics. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.


July 7, 2019

Prediction of putative resistance islands in a carbapenem-resistant Acinetobacter baumannii global clone 2 clinical isolate.

We investigated the whole genome sequence (WGS) of a carbapenem-resistant Acinetobacter baumannii isolate belonging to the global clone 2 (GC2) and predicted resistance islands using a software tool. A. baumannii strain YU-R612 was isolated from the sputum of a 61-year-old man with sepsis. The WGS of the YU-R612 strain was obtained by using the PacBio RS II Sequencing System (Pacific Biosciences Inc., USA). Antimicrobial resistance genes and resistance islands were analyzed by using ResFinder and Genomic Island Prediction software (GIPSy), respectively. The YU-R612 genome consisted of a circular chromosome (ca. 4,075 kb) and two plasmids (ca. 74 kb and 5 kb). Its sequence type (ST) under the Oxford scheme was ST191, consistent with assignment to GC2. ResFinder analysis showed that YU-R612 possessed the following resistance genes: four β-lactamase genes bla(ADC-30), bla(OXA-66), bla(OXA-23), and bla(TEM-1); armA, aadA1, and aacA4 as aminoglycoside resistance-encoding genes; aac(6′)Ib-cr for fluoroquinolone resistance; msr(E) for macrolide, lincosamide, and streptogramin B resistance; catB8 for phenicol resistance; and sul1 for sulfonamide resistance. By GIPSy analysis, six putative resistance islands (PRIs) were determined on the YU-R612 chromosome. Among them, PRI1 possessed two copies of Tn2009 carrying bla(OXA-23), and PRI5 carried two copies of a class I integron carrying sul1 and armA genes. By prediction of resistance islands in the carbapenem-resistant A. baumannii YU-R612 GC2 strain isolated in Korea, PRIs were detected on the chromosome that possessed Tn2009 and class I integrons. The prediction of resistance islands using software tools was useful for analysis of the WGS.


July 7, 2019

Complete nucleotide sequence of pH11, an IncHI2 plasmid conferring multi-antibiotic resistance and multi-heavy metal resistance genes in a clinical Klebsiella pneumoniae isolate.

The complete 284,628 bp sequence of pH11, an IncHI2 plasmid, was determined through single-molecule, real-time (SMRT) sequencing. Harbored by the clinical Klebsiella pneumoniae strain H11, which was isolated in Beijing, this plasmid contains multiple antibiotic resistance genes, including catA2, aac(6′)-Ib, strB, strA, dfrA19, blaTEM-1, blaSHV-12, sul1, qacEΔ1, ereA, arr2, and aac3. The aac(6′)-Ib is carried by a class I integron. Plasmid pH11 also carries several genes associated with resistance to heavy metals, such as tellurium, mercury, cobalt, zinc, nickel, copper, lead and cadmium. This plasmid exhibits numerous characteristics, including HipBA and RelBE toxin-antitoxin systems, two major transfer (Tra) regions closely related to those of Salmonella enterica serovar plasmid pRH-R27, a type II restriction modification system (EcoRII R-M system), several methyltransferases and methylases and genes encoding Hha and StpA. These characteristics suggest that pH11 may adapt to various hosts and environments. Multiple insertion sequence elements, transposases, recombinases, resolvases and integrases are scattered throughout pH11. The presence of these genes may indicate that horizontal gene transfer occurs frequently in pH11 and thus may facilitate the dissemination of antimicrobial resistance determinants. Our data suggest that pH11 is a chimera gradually assembled through the integration of different horizontally acquired DNA segments via transposition or homologous recombination. Copyright © 2016 Elsevier Inc. All rights reserved.

