July 7, 2019

Genome sequencing brought Gossypium biology research into a new era.

The first sequenced diploid cotton genome was published in 2012 by the group led by the Institute of Cotton Research, Chinese Academy of Agricultural Sciences. Cotton genomics research subsequently entered a period of rapid development. The accumulating data have provided new insights into the evolution and domestication of cotton, the development of important agronomic traits, and strategies for improving cotton quality and production.


July 7, 2019

Bi-level error correction for PacBio long reads.

The latest sequencing technologies, such as the Pacific Biosciences (PacBio) and Oxford Nanopore machines, can generate long reads thousands of bases in length, much longer than the reads of a few hundred bases generated by Illumina machines. However, these long reads are prone to much higher error rates, for example 15%, making downstream analysis and applications very difficult. Error correction is a process to improve the quality of sequencing data. Hybrid correction strategies have recently been proposed that use Illumina reads with low error rates to fix sequencing errors in the noisy long reads, with good performance. In this paper, we propose a new method named Bicolor, a bi-level framework of hybrid error correction for further improving the quality of PacBio long reads. At the first level, our method uses a de Bruijn graph-based error correction idea to search paths in pairs of solid k-mers iteratively with an increasing k-mer length. At the second level, we combine the processed results under different parameters from the first level. In particular, a multiple sequence alignment algorithm is used to align those similar long reads, followed by a voting algorithm which determines the final base at each position of the reads. We compare the performance of Bicolor with that of three state-of-the-art methods on three real data sets. Results demonstrate that Bicolor always achieves the highest identity ratio. Bicolor also achieves a higher alignment ratio and a higher number of aligned reads than the current methods on two data sets. On the third data set, our method is closely competitive with the current methods in terms of number of aligned reads and genome coverage. The C++ source code of our algorithm is freely available at https://github.com/yuansliu/Bicolor.
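
The two ideas named in the abstract, solid k-mers and per-position voting, can be illustrated with a minimal Python sketch. This is not the Bicolor implementation (which is in C++ at the URL above). The function names, the `min_count` threshold used to call a k-mer "solid", and the assumption that the reads are already aligned to equal length with `-` as the gap symbol are all hypothetical choices made for illustration.

```python
from collections import Counter


def solid_kmers(short_reads, k, min_count=2):
    """Return k-mers seen at least min_count times in the short reads.

    Assumption: a k-mer is treated as "solid" when its count reaches
    min_count; the threshold value here is illustrative only.
    """
    counts = Counter(
        read[i:i + k]
        for read in short_reads
        for i in range(len(read) - k + 1)
    )
    return {kmer for kmer, count in counts.items() if count >= min_count}


def majority_vote_consensus(aligned_reads, gap="-"):
    """Pick the most frequent symbol in each alignment column.

    Assumption: the reads were already aligned (equal length, gaps
    written as `gap`). Columns where the gap wins are dropped.
    """
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")
    consensus = []
    for column in zip(*aligned_reads):
        symbol, _ = Counter(column).most_common(1)[0]
        if symbol != gap:
            consensus.append(symbol)
    return "".join(consensus)


if __name__ == "__main__":
    print(solid_kmers(["ACGT", "ACGA"], k=3))
    print(majority_vote_consensus(["ACGT-A", "ACCT-A", "ACGTTA"]))
```

In this sketch, ties in a column are resolved in favour of the symbol that appears first among the reads; a real tool would more likely weigh base qualities or read support.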


July 7, 2019

A blaOXA-181-harbouring multi-resistant ST147 Klebsiella pneumoniae isolate from Pakistan that represents an intermediate stage towards pan-drug resistance.

Carbapenem-resistant Klebsiella pneumoniae (CR-KP) infections are an ever-increasing global issue, especially in the Indian subcontinent. Here we report genetic insight into a blaOXA-181-harbouring Klebsiella pneumoniae, belonging to the pandemic lineage ST147, that represents an intermediate stage towards pan-drug resistance. The CR-KP isolate DA48896 was isolated from a patient from Pakistan and was susceptible only to tigecycline and colistin. It harboured blaOXA-181 and was assigned to sequence type ST147. Analysis from whole genome sequencing revealed a very high sequence similarity to the previously sequenced pan-resistant K. pneumoniae isolate MS6671 from the United Arab Emirates. The two isolates are very closely related, with only 46 chromosomal nucleotide differences, 14 indels and differences in plasmid content. Both carry a substantial number of plasmid-borne and chromosomally encoded resistance determinants. Interestingly, the two differences in susceptibility between the isolates could be attributed to DA48896 lacking both an insertion of blaOXA-181 into the mgrB gene, which results in colistin resistance in MS6671, and SNPs affecting AcrAB efflux pump expression, which are likely to result in tigecycline resistance. These differences between the otherwise very similar isolates indicate that strong selection has occurred for resistance towards these last-resort drugs and illustrate the trajectory of resistance evolution of OXA-181-producing versions of the ST147 international risk clone.


July 7, 2019

Complete genome sequence and comparative genomics of the golden pompano (Trachinotus ovatus) pathogen, Vibrio harveyi strain QT520.

Vibrio harveyi is a Gram-negative, halophilic bacterium that is an opportunistic pathogen of commercially farmed marine vertebrate species. To understand the pathogenicity of this species, the genome of V. harveyi QT520 was analyzed and compared to that of other strains. The results showed the genome of QT520 has two unique circular chromosomes and three endogenous plasmids, totaling 6,070,846 bp with a 45% GC content, 5,701 predicted ORFs, 134 tRNAs and 37 rRNAs. Common virulence factors, including ACF, IlpA, OmpU, Flagellin, Cya, Hemolysin and MARTX, were detected in the genome, which are likely responsible for the virulence of QT520. The results of genome comparisons with strains ATCC 33843 (392 (MAV)) and ATCC 43516 showed that greater numbers of genes associated with types I, II, III, IV and VI secretion systems were detected in QT520 than in other strains, suggesting that QT520 is a highly virulent strain. In addition, three plasmids were only observed in the complete genome sequence of strain QT520. In plasmid p1 of QT520, specific virulence factors (cyaB, hlyB and rtxA) were identified, suggesting that the pathogenicity of this strain is plasmid-associated. Phylogenetic analysis of 12 complete Vibrio sp. genomes using ANI values, core genes and MLST revealed that QT520 was most closely related to ATCC 33843 (392 (MAV)) and ATCC 43516, suggesting that QT520 belongs to the species V. harveyi. This report is the first to describe the complete genome sequence of a V. harveyi strain isolated from an outbreak in a fish species in China. In addition, to the best of our knowledge, this report is the first to compare the V. harveyi genomes of several strains. The results of this study will expand our understanding of the genome, genetic characteristics, and virulence factors of V. harveyi, setting the stage for studies of pathogenesis, diagnostics, and disease prevention.


July 7, 2019

Genomic characterization of a local epidemic Pseudomonas aeruginosa reveals specific features of the widespread clone ST395.

Pseudomonas aeruginosa is a ubiquitous opportunistic pathogen with several clones being frequently associated with outbreaks in hospital settings. ST395 is among these so-called ‘international’ clones. We aimed here to define the biological features that could have helped the implantation and spread of the clone ST395 in hospital settings. The complete genome of a multidrug resistant index isolate (DHS01) of a large hospital outbreak was analysed. We identified DHS01-specific genetic elements and, among them, those shared with a panel of six independent ST395 isolates responsible for outbreaks in other hospitals. DHS01 has the fifth largest chromosome of the species (7.1 Mbp), with most of its 1555 accessory genes borne by either genomic islands (GIs, n=48) or integrative and conjugative elements (ICEs, n=5). DHS01 is multidrug resistant mostly due to chromosomal mutations. It displayed signatures of adaptation to chronic infection in part due to the loss of a 131 kbp chromosomal fragment. Four GIs were specific to the clone ST395 and contained genes involved in metabolism (GI-4), in virulence (GI-6) and in resistance to copper (GI-7). GI-7 harboured an array of six copper transporters and was shared with non-pathogenic Pseudomonas sp. retrieved from copper-contaminated environments. Copper resistance was confirmed phenotypically in all other ST395 isolates and possibly accounted for the spreading capability of the clone in hospital outbreaks, where water networks have been incriminated. This suggests that genes transferred from copper-polluted environments may have favoured the implantation and spread of the international clone P. aeruginosa ST395 in hospital settings.


July 7, 2019

High-quality draft genome sequence of Streptomyces agglomeratus 5-1-8 with strong anti-MRSA ability, isolated from the frozen soil of Tibet in China

Streptomyces agglomeratus 5-1-8, isolated from the frozen soil of Tibet in China, has a strong ability to kill multidrug-resistant methicillin-resistant Staphylococcus aureus (MRSA). To identify the secondary metabolism ability of this strain, we describe here its phenotypic characteristics, along with its high-quality draft genome sequence, its annotation, and analysis. The 7.1 Mb draft genome encodes 6,284 putative open reading frames (ORFs), of which 4,416 ORFs were assigned to clusters of orthologous genes (COG) categories. Also, 65 tRNA genes and 24 rRNA operons were identified. The genome contains 12 gene clusters involved in antibiotics production and 1 gene cluster involved in anticancer-compounds production; 4 gene clusters belong to the polyketides and nonribosomal peptides, 1 gene cluster belongs to the butyrolactones, 4 gene clusters belong to the bacteriocins or lantipeptides, and 3 gene clusters belong to other types. This genome-sequence data will facilitate efforts to probe the potential of new antibiotics to kill multidrug-resistant MRSA.


July 7, 2019

On the importance of homology in the age of phylogenomics

Homology is perhaps the most central concept of phylogenetic biology. Molecular systematists have traditionally paid due attention to the homology statements that are implied by their alignments of orthologous sequences, but some authors have suggested that manual gene-by-gene curation is not sustainable in the phylogenomics era. Here, we show that there are multiple ways to efficiently screen for and detect homology errors in phylogenomic data sets. Application of these screening approaches to two phylogenomic data sets, one for birds and another for mammals, shows that these data are replete with homology errors including alignments of different exons to each other, alignments of exons to introns, and alignments of paralogues to each other. The extent of these homology errors weakens the conclusions of studies based on these data sets. Despite advances in automated phylogenomic pipelines, we contend that much of the long, difficult, and sometimes tedious work of systematics is still required to guard against pervasive homology errors. This conclusion is underscored by recent studies that show that just a few outlier genes can impact phylogenetic results at short, tightly spaced internodes that are deep in the Tree of Life. The view that widespread DNA sequence alignment errors are not a major concern for rigorous systematic research is not tenable. If a primary goal of phylogenomics is to resolve the most challenging phylogenetic problems with the abundant data that are now available, researchers must employ effective procedures to screen for and correct homology errors prior to performing downstream phylogenetic analyses.


July 7, 2019

COSINE: non-seeding method for mapping long noisy sequences.

Third generation sequencing (TGS) technologies are highly promising, but the long and noisy reads from TGS are difficult to align using existing algorithms. Here, we present COSINE, a conceptually new method designed specifically for aligning long reads contaminated by a high level of errors. COSINE computes the context similarity of two stretches of nucleobases given the similarity over distributions of their short k-mers (k = 3-4) along the sequences. The results on simulated and real data show that COSINE achieves high sensitivity and specificity under a wide range of read accuracies. When the error rate is high, COSINE can offer substantial advantages over existing alignment methods. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
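
To make "similarity over distributions of short k-mers" concrete, here is a minimal Python sketch that compares two sequences by their k-mer count profiles. It assumes cosine similarity as the measure, which the abstract does not confirm, and it is not the COSINE algorithm itself. The function names `kmer_profile` and `cosine_similarity` are hypothetical.

```python
import math
from collections import Counter


def kmer_profile(seq, k=3):
    """Count every overlapping k-mer in seq (k = 3 follows the abstract's range)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(profile_a, profile_b):
    """Cosine of the angle between two k-mer count vectors.

    Assumption: cosine similarity stands in for the "context
    similarity" of the abstract; the published method may differ.
    """
    shared = profile_a.keys() & profile_b.keys()
    dot = sum(profile_a[kmer] * profile_b[kmer] for kmer in shared)
    norm_a = math.sqrt(sum(v * v for v in profile_a.values()))
    norm_b = math.sqrt(sum(v * v for v in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a sequence shorter than k has an empty profile
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    read = kmer_profile("ACGTACGTAC")
    noisy = kmer_profile("ACGTACCTAC")  # one substitution
    print(cosine_similarity(read, noisy))
```

Because a single base error changes at most k overlapping k-mers, the profiles of a read and its noisy copy stay largely similar, which is the intuition behind comparing distributions instead of exact seeds.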


July 7, 2019

A feast of malaria parasite genomes.

The Plasmodium genus has evolved over time and across hosts, complexifying our understanding of malaria. In a recent Nature paper, Rutledge et al. (2017) describe the genome sequences of three major human malaria parasite species, providing insight into Plasmodium evolution and raising the question of how many species there are. Copyright © 2017 Elsevier Inc. All rights reserved.


July 7, 2019

A recurrence-based approach for validating structural variation using long-read sequencing technology.

Although numerous algorithms have been developed to identify structural variations (SVs) in genomic sequences, there is a dearth of approaches that can be used to evaluate their results. This is significant as the accurate identification of structural variation is still an outstanding but important problem in genomics. The emergence of new sequencing technologies that generate longer sequence reads can, in theory, provide direct evidence for all types of SVs regardless of the length of the region they span. However, current efforts to use these data in this manner require the use of large computational resources to assemble these sequences as well as visual inspection of each region. Here we present VaPoR, a highly efficient algorithm that autonomously validates large SV sets using long-read sequencing data. We assessed the performance of VaPoR on SVs in both simulated and real genomes and report a high-fidelity rate for overall accuracy across different levels of sequence depths. We show that VaPoR can interrogate a much larger range of SVs while still matching existing methods in terms of false positive validations and providing additional features considering breakpoint precision and predicted genotype. We further show that VaPoR can run quickly and efficiently without requiring a large processing or assembly pipeline. VaPoR provides a long read-based validation approach for genomic SVs that requires relatively low read depth and computing resources and thus will provide utility with targeted or low-pass sequencing coverage for accurate SV assessment. The VaPoR Software is available at: https://github.com/mills-lab/vapor. © The Authors 2017. Published by Oxford University Press.


July 7, 2019

De novo design and synthesis of a 30-cistron translation-factor module.

Two of the many goals of synthetic biology are synthesizing large biochemical systems and simplifying their assembly. While several genes have been assembled together by modular idempotent cloning, it is unclear if such simplified strategies scale to very large constructs for expression and purification of whole pathways. Here we synthesize from oligodeoxyribonucleotides a completely de-novo-designed, 58-kb multigene DNA. This BioBrick plasmid insert encodes 30 of the 31 translation factors of the PURE translation system, each His-tagged and in separate transcription cistrons. Dividing the insert between three high-copy expression plasmids enables the bulk purification of the aminoacyl-tRNA synthetases and translation factors necessary for affordable, scalable reconstitution of an in vitro transcription and translation system, PURE 3.0. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.


July 7, 2019

Unlocking the biological potential of Euglena gracilis: evolution, cell biology and significance to parasitism

Photosynthetic euglenids are major components of aquatic ecosystems and relatives of trypanosomes. Euglena gracilis has considerable biotechnological potential and great adaptability, but exploitation remains hampered by the absence of a comprehensive gene catalogue. We address this by genome, RNA and protein sequencing: the E. gracilis genome is >2 Gb, with 36,526 predicted proteins. Large lineage-specific paralog families are present, with evidence for flexibility in environmental monitoring, divergent mechanisms for metabolic control, and novel solutions for adaptation to extreme environments. Contributions from photosynthetic eukaryotes to the nuclear genome, consistent with the shopping bag model, are found, together with transitions between kinetoplastid and canonical systems. Control of protein expression is almost exclusively post-transcriptional. These data are a major advance in understanding the nuclear genomes of euglenids and provide a platform for investigating the contributions of E. gracilis and its relatives to the biosphere.


July 7, 2019

Genomic clues to the parental origin of the wild flowering cherry Prunus yedoensis var. nudiflora (Rosaceae)

Prunus yedoensis Matsumura is one of the popular ornamental flowering cherry trees native to northeastern Asia, and its wild populations have only been found on Jeju Island, Korea. Previous studies suggested that wild P. yedoensis (P. yedoensis var. nudiflora) is a hybrid species; however, there is no solid evidence on its exact parental origin and genomic organization. In this study, we developed a total of 38 nuclear gene-based DNA markers that can be universally amplifiable in the Prunus species using 586 Prunus Conserved Orthologous Gene Set (Prunus COS). Using the Prunus COS markers, we investigated the genetic structure of wild P. yedoensis populations and evaluated the putative parental species of wild P. yedoensis. Population structure and phylogenetic analysis of 73 wild P. yedoensis accessions and 54 accessions of other Prunus species revealed that the wild P. yedoensis on Jeju Island is a natural homoploid hybrid. Sequence-level comparison of Prunus COS markers between species suggested that wild P. yedoensis might originate from a cross between maternal P. pendula f. ascendens and paternal P. jamasakura. Moreover, approximately 81% of the wild P. yedoensis accessions examined were likely F1 hybrids, whereas the remaining 19% were backcross hybrids resulting from additional asymmetric introgression of parental genotypes. These findings suggest that complex hybridization of the Prunus species on Jeju Island can produce a range of variable hybrid offspring. Overall, this study makes a significant contribution to address issues of the origin, nomenclature, and genetic relationship of ornamental P. yedoensis.


July 7, 2019

Genome sequence-based marker development and genotyping in potato

Potato (Solanum tuberosum L.) is one of the world’s most economically important food crops and holds major significance for future food security. Despite its importance, the study of potato genetics and breeding has lagged behind, mainly due to its polyploid genome and high levels of heterozygosity. Conventional marker and genotyping approaches have been helpful in progressing potato genetic research but have also had limitations in exploiting the outcome from these studies for gene discovery and applied research applications. The sequencing of the potato genome, followed by advancements in marker and genotyping technologies, has brought a step change in the way potato genetic studies are conducted. Potato is now amenable to modern sequence-based marker and genotyping methods with their increased ability to put thousands of markers on any population of interest without a priori knowledge. This has increased the precision and resolution of genetic studies to a level previously not feasible in potato. A diverse range of fixed and flexible genotyping platforms, for a wide variety of research and breeding applications, is now available. Concerted research efforts are now needed to screen the available genetic diversity for this important crop to identify novel and beneficial trait alleles in order to enable efficient and precise introgression breeding, permitting breeding of climate-smart and resilient potato cultivars. This chapter provides an overview of sequence-based marker development and genotyping methods along with their implications for potato research and breeding in the post-genomics era.


July 7, 2019

The state of whole-genome sequencing

Over the last decade, a technological paradigm shift has slashed the cost of DNA sequencing by over five orders of magnitude. Today, the cost of sequencing a human genome is a few thousand dollars, and it continues to fall. Here, we review the most cost-effective platforms for whole-genome sequencing (WGS) as well as emerging technologies that may displace or complement these. We also discuss the practical challenges of generating and analyzing WGS data, and how WGS has unlocked new strategies for discovering genes and variants underlying both rare and common human diseases.

