Cysteine Motif Database Tool under review by Handawi

Cysteine Motif Database (CMD)

Computational approaches to predicting disulfide bonding state and connectivity patterns are based on various descriptors. One such descriptor is derived from the sequence’s amino acid composition and the residues flanking each cysteine. These neighbouring residues have been shown to influence the cysteine’s redox potential and steric accessibility. Since their proposal as a descriptor in 1990, these sequence motifs have been fed into various prediction methods, such as machine learning approaches (e.g. statistical methods, neural networks (NNs) and support vector machines (SVMs)), and have been the basis of prediction tools such as DiANNA, DISULFIND, DCON and CysView.
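The flanking-residue descriptor described above can be illustrated with a minimal sketch. It assumes a symmetric window of `flank` residues on each side of every cysteine, padded with `-` at the sequence termini. The window size, the padding character and the function name `cysteine_motifs` are illustrative assumptions, not the motif definition used by CMD or by the tools named above.

```python
def cysteine_motifs(sequence, flank=3, pad="-"):
    """Return (position, motif) pairs for every cysteine in `sequence`.

    `position` is 1-based. Each motif is the cysteine plus `flank`
    residues on either side; windows that run past a terminus are
    filled with `pad` so all motifs have the same length.
    """
    seq = sequence.upper()
    padded = pad * flank + seq + pad * flank
    motifs = []
    for i, residue in enumerate(seq):
        if residue == "C":
            # Residue i of `seq` sits at index i + flank in `padded`,
            # so the centred window is padded[i : i + 2 * flank + 1].
            motifs.append((i + 1, padded[i:i + 2 * flank + 1]))
    return motifs
```

For example, `cysteine_motifs("ACDEFCGH", flank=2)` yields one window per cysteine, each five characters long with the cysteine in the centre.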

However, there is currently no database that stores these disulfide motifs. Motivated by this absence and by the usefulness of such motifs in predicting cysteine bonding state and connectivity, we have developed the Cysteine Motif Database (CMD) to store cysteine motifs. The creation of a motif miner in CMD will allow motifs to be extracted and stored, and their bonding and connectivity propensities to be studied. Examining these sequences would allow researchers to study their composition propensities and the role of composition in determining bonding state. The expansion of the RCSB and the growing number of PDB files have significantly increased the number of motifs beyond what has been utilised in prior research. We extracted 878,000 cysteine motifs, from which users can now query more than 77,000 unique cysteine motifs and cysteine pairing motifs generated from PDB and UniProt files. CMD query types include PDB ID, UniProt ID, sequence and motif. The datasets are downloadable and can be parsed through a web service and API. We plan to present CMD as a publicly available tool that would complement existing tools and composition analyses that use similar motif schemes.
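As a rough illustration of what studying bonding propensities could look like, the sketch below reduces a list of extracted motifs to the set of unique motifs and, for each one, the fraction of occurrences in which the central cysteine is disulfide-bonded. The input format (pairs of motif string and a bonded flag) and the function name `bonding_propensity` are hypothetical. The source does not describe CMD's internal schema or how its propensities are computed.

```python
from collections import defaultdict


def bonding_propensity(observations):
    """Map each unique motif to its observed bonded fraction.

    `observations` is an iterable of (motif, bonded) pairs, where
    `bonded` is True when the central cysteine forms a disulfide bond.
    This input layout is an assumption made for illustration.
    """
    counts = defaultdict(lambda: [0, 0])  # motif -> [bonded, total]
    for motif, bonded in observations:
        counts[motif][1] += 1
        if bonded:
            counts[motif][0] += 1
    return {motif: b / total for motif, (b, total) in counts.items()}
```

Called on a toy list such as `[("ACD", True), ("ACD", False), ("GCH", True)]`, the function returns one entry per unique motif, which mirrors the reduction from all extracted motifs to the smaller set of unique, queryable ones.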

Bioinformatic Research Group (BIRG) © Copyright 2011 All Rights Reserved