July 7, 2019  |  

Genome puzzle master (GPM): an integrated pipeline for building and editing pseudomolecules from fragmented sequences.

Next generation sequencing technologies have revolutionized our ability to rapidly and affordably generate vast quantities of sequence data. Once generated, raw sequences are assembled into contigs or scaffolds. However, these assemblies are mostly fragmented and inaccurate at the whole genome scale, largely due to the inability to integrate additional informative datasets (e.g. physical, optical and genetic maps). To address this problem, we developed a semi-automated software tool, Genome Puzzle Master (GPM), that enables the integration of additional genomic signposts to edit and build ‘new-gen-assemblies’ that result in high-quality ‘annotation-ready’ pseudomolecules. With GPM, loaded datasets can be connected to each other via their logical relationships, which accomplishes the tasks of ‘group,’ ‘merge,’ and ‘order and orient’ for sequences in a draft assembly. Manual editing can also be performed with a user-friendly graphical interface. Final pseudomolecules reflect a user’s total data package and are available for long-term project management. GPM is a web-based pipeline and an important part of a Laboratory Information Management System (LIMS) which can be easily deployed on local servers for any genome research laboratory. The GPM (with LIMS) package is available at https://github.com/Jianwei-Zhang/LIMS. Contacts: jzhang@mail.hzau.edu.cn or rwing@mail.arizona.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.


July 7, 2019  |  

Third-generation sequencing and analysis of four complete pig liver esterase gene sequences in clones identified by screening BAC library.

Pig liver carboxylesterase (PLE) gene sequences in GenBank are incomplete, which has led to difficulties in studying the genetic structure and regulation mechanisms of gene expression of PLE family genes. The aim of this study was to obtain and analyse complete gene sequences of the PLE family by screening a Rongchang pig BAC library and applying third-generation PacBio sequencing. After a number of existing incomplete PLE isoform gene sequences were analysed, primers were designed based on conserved regions in PLE exons, and the whole pig genome was used as a template for polymerase chain reaction (PCR) amplification. Specific primers were then selected based on the PCR amplification results. A three-step PCR screening method was used to identify PLE-positive clones by screening a Rongchang pig BAC library, and PacBio third-generation sequencing was performed. BLAST comparisons and other bioinformatics methods were applied for sequence analysis. Five PLE-positive BAC clones, designated BAC-10, BAC-70, BAC-75, BAC-119 and BAC-206, were identified. Sequence analysis yielded the complete sequences of four PLE genes, PLE1, PLE-B9, PLE-C4, and PLE-G2. Complete PLE gene sequences were defined as those containing regulatory sequences, exons, and introns. Not only did the PLE exon sequences of the four genes show a high degree of homology, but the intron sequences were also highly similar. Additionally, the regulatory region of the genes contained two 720-bp reverse complement sequences that may have an important function in the regulation of PLE gene expression. This is the first report to confirm the complete sequences of four PLE genes. In addition, the study demonstrates that each PLE isoform is encoded by a single gene and that the various genes exhibit a high degree of sequence homology, suggesting that the PLE family evolved from a single ancestral gene. Obtaining the complete sequences of these PLE genes provides the necessary foundation for investigation of the genetic structure, function, and regulatory mechanisms of the PLE gene family.


July 7, 2019  |  

BAC-pool sequencing and analysis confirms growth-associated QTLs in the Asian seabass genome.

The Asian seabass is an important marine food fish that has been cultured for several decades in Asia Pacific. However, the lack of a high quality reference genome has hampered efforts to improve its selective breeding. A 3D BAC pool set generated in this study was screened using 22 SSR markers located on linkage group 2 which contains a growth-related QTL region. Seventy-two clones corresponding to 22 FPC contigs were sequenced by Illumina MiSeq technology. We co-assembled the MiSeq-derived scaffolds from each FPC contig with error-corrected PacBio reads, resulting in 187 sequences covering 9.7 Mb. Eleven genes annotated within this region were found to be potentially associated with growth and their tissue-specific expression was investigated. Correlation analysis demonstrated that SNPs in ctsb, skp1 and ppp2ca can be potentially used as markers for selecting fast-growing fingerlings. Conserved syntenies between seabass LG2 and five other teleosts were identified. This study i) provided a 10 Mb targeted genome assembly; ii) demonstrated NGS of BAC pools as a potential approach for mining candidates underlying QTLs of this species; iii) detected eleven genes potentially responsible for growth in the QTL region; and iv) identified useful SNP markers for selective breeding programs of Asian seabass.


July 7, 2019  |  

Interchromosomal core duplicons drive both evolutionary instability and disease susceptibility of the Chromosome 8p23.1 region.

Recurrent rearrangements of Chromosome 8p23.1 are associated with congenital heart defects and developmental delay. The complexity of this region has led to inconsistencies in the current reference assembly, confounding studies of genetic variation. Using comparative sequence-based approaches, we generated a high-quality 6.3-Mbp alternate reference assembly of an inverted Chromosome 8p23.1 haplotype. Comparison with nonhuman primates reveals a 746-kbp duplicative transposition and two separate inversion events that arose in the last million years of human evolution. The breakpoints associated with these rearrangements map to an ape-specific interchromosomal core duplicon that clusters at sites of evolutionary inversion (P = 7.8 × 10⁻⁵). Refinement of microdeletion breakpoints identifies a subgroup of patients that map to the same interchromosomal core involved in the evolutionary formation of the duplication blocks. Our results define a higher-order genomic instability element that has shaped the structure of specific chromosomes during primate evolution contributing to rearrangements associated with inversion and disease. © 2016 Mohajeri et al.; Published by Cold Spring Harbor Laboratory Press.


July 7, 2019  |  

Chimeras link to tandem repeats and transposable elements in tetraploid hybrid fish

The formation of the allotetraploid hybrid lineage (4nAT) encompasses both distant hybridization and polyploidization processes. The allotetraploid offspring have two sets of sub-genomes inherited from both parental species, and it is therefore important to explore their genetic structure. Herein, we construct a bacterial artificial chromosome library of allotetraploids, and then sequence and analyze the full-length sequences of 19 bacterial artificial chromosomes. Sixty-eight DNA chimeras are identified, which are divided into four models according to the distribution of the genomic DNA derived from the parents. Among the 68 genetic chimeras, 44 (64.71%) are linked to tandem repeats (TRs) and 23 (33.82%) are linked to transposable elements (TEs). The chimeras linked to TRs are related to slipped-strand mispairing and double-strand break repair, while the chimeras linked to TEs benefit from the intervention of recombinases. In addition, TRs and TEs are linked not only with the recombinations, but also with the insertions/deletions of DNA segments. We conclude that DNA chimeras accompanied by TRs and TEs coordinate a balance between the sub-genomes derived from the parents, which reduces the genomic shock effects and favors the evolutionary and adaptive capacity of the allotetraploidization. This is the first report on the relationship between the formation of DNA chimeras and TRs and TEs in polyploid animals.


July 7, 2019  |  

New high copy tandem repeat in the content of the chicken W chromosome.

The content of repetitive DNA in avian genomes is considerably less than in other investigated vertebrates. The first descriptions of tandem repeats were based on the results of routine biochemical and molecular biological experiments. Both satellite DNA and interspersed repetitive elements were annotated using a library-based approach and de novo repeat identification in the assembled genome. The development of deep-sequencing methods provides datasets of high quality without preassembly, allowing one to annotate repetitive elements from the unassembled part of genomes. In this work, we search the chicken assembly and annotate high copy number tandem repeats from unassembled short raw reads. The tandem repeat (GGAAA)n has been identified and found to be the second most abundant in the chicken genome, after the telomeric repeat (TTAGGG)n. Furthermore, the (GGAAA)n repeat forms expanded arrays on both arms of the chicken W chromosome. Our results highlight the complexity of repetitive sequences and update data about the organization of the W sex chromosome in chicken.


July 7, 2019  |  

Current advances in genome sequencing of common wheat and its ancestral species

Common wheat is an important and widely cultivated food crop throughout the world. Much progress has been made in wheat genome sequencing in the last decade. Starting from the sequencing of single chromosomes/chromosome arms, whole genome sequences of common wheat and its diploid and tetraploid ancestors have been decoded along with the development of sequencing and assembling technologies. In this review, we give a brief summary of international progress in wheat genome sequencing, and mainly focus on reviewing the effort and contributions made by Chinese scientists.


July 7, 2019  |  

The challenge of analyzing the sugarcane genome.

Reference genome sequences have become key platforms for genetics and breeding of the major crop species. Sugarcane is probably the largest crop produced in the world (in weight of crop harvested) but lacks a reference genome sequence. Sugarcane has one of the most complex genomes in crop plants due to the extreme level of polyploidy. The genome of modern sugarcane hybrids includes sub-genomes from two progenitors, Saccharum officinarum and S. spontaneum, with some chromosomes resulting from recombination between these sub-genomes. Advancing DNA sequencing technologies and strategies for genome assembly are making the sugarcane genome more tractable. Advances in long read sequencing have allowed the generation of a more complete set of sugarcane gene transcripts, which is supporting transcript profiling in genetic research. The progenitor genomes are being sequenced. A monoploid coverage of the hybrid genome has been obtained by sequencing BAC clones that cover the gene space of the closely related sorghum genome. The complete polyploid genome is now being sequenced and assembled. The emerging genome will allow comparison of related genomes and increase understanding of the functioning of this polyploid system. Sugarcane breeding for traditional sugar and new energy and biomaterial uses will be enhanced by the availability of these genomic resources.

