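A distance-dependent potential of mean force (PMF) method at the atomic level has been developed for studying protein-protein interactions. Compared with traditional theoretical models, our model takes into account the complex environmental factors of protein systems. This improvement enables the model to yield a physically more reasonable and accurate functional form of the potential. Obtaining such a potential function is a prerequisite for correctly describing protein structures and interactions. Moreover, with the improved method, the spatial topological patterns of residue interactions in proteins can also be studied. We expect this improvement to promote the application and development of the PMF method in other areas of protein science, such as protein fold recognition, structure prediction, and thermal stability prediction.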
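For orientation, a distance-dependent PMF is conventionally obtained by inverse-Boltzmann statistics: the potential in each distance bin is -kT times the logarithm of the observed pair frequency relative to a reference-state frequency. The sketch below shows only that conventional form, not the environment-corrected model described in this abstract, whose details are not given here. The function name `pmf_from_counts`, the binning, and the value of kT are assumptions made for illustration.

```python
import math

# Assumed thermal energy, roughly kT at 298 K in kcal/mol (illustrative value).
K_B_T = 0.592


def pmf_from_counts(observed, reference, kT=K_B_T):
    """Conventional inverse-Boltzmann PMF per distance bin.

    observed[i]  : pair count in distance bin i for one atom-type pair
    reference[i] : pair count in the same bin for the chosen reference state
    Returns a list of potentials; bins with a zero count are left undefined (None).
    """
    obs_total = sum(observed)
    ref_total = sum(reference)
    potentials = []
    for n_obs, n_ref in zip(observed, reference):
        if n_obs == 0 or n_ref == 0:
            potentials.append(None)  # the log ratio is undefined for empty bins
            continue
        # Ratio of normalized frequencies: observed versus reference state.
        ratio = (n_obs / obs_total) / (n_ref / ref_total)
        potentials.append(-kT * math.log(ratio))
    return potentials


# Hypothetical counts in three distance bins.
example = pmf_from_counts([10, 20, 10], [10, 10, 20])
```

A bin in which the pair is seen more often than in the reference state gets a negative (favorable) potential, and a depleted bin gets a positive one. The choice of reference state is where approaches of this kind differ most.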
Sequence alignment is a common method for finding structurally conserved or similar regions in proteins. However, sequence alignment is often inaccurate when the sequence identity between the sequences to be aligned is less than 30%. This is because, in such sequences, different residues may play similar structural roles, and they are incorrectly aligned when the alignment uses a substitution matrix defined over all 20 residue types. Based on the similarity of their physicochemical features, residues can be clustered into a few groups. With such simplified alphabets, the complexity of protein sequences is reduced while the key information encoded in the sequences is retained. As a result, the accuracy of sequence alignment might be improved if the residues are properly clustered. Here, using a database of aligned protein structures (DAPS), a new clustering method based on substitution scores is proposed for grouping residues, and substitution matrices of residues at different levels of simplification are constructed. The validity of the reduced alphabets is confirmed by relative entropy analysis. The reduced alphabets are then applied to the recognition of structurally conserved or similar regions in proteins by sequence alignment. The results indicate that the accuracy or efficiency of sequence alignment can be improved with the optimal reduced alphabet, which has N around 9 letters.
LI Jing1 & WANG Wei1,2 1 National Laboratory of Solid State Microstructure and Department of Physics, Nanjing University, Nanjing 210093, China
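The reduced-alphabet idea in the abstract above can be illustrated with a minimal sketch: residues are mapped to group representatives, and two pre-aligned sequences are then compared in the reduced alphabet. The nine groups used here are a hypothetical grouping by broad physicochemical similarity, chosen only for illustration. They are not the alphabet derived from DAPS substitution scores in the paper, and the helper names are likewise assumptions.

```python
# Hypothetical 9-letter grouping of the 20 amino acids (illustration only;
# NOT the grouping derived in the paper).
EXAMPLE_GROUPS = ["AVLIMC", "FWY", "ST", "NQ", "DE", "KR", "H", "G", "P"]


def build_mapping(groups):
    """Map every residue to the first letter of its group."""
    mapping = {}
    for group in groups:
        representative = group[0]
        for residue in group:
            mapping[residue] = representative
    return mapping


def reduce_sequence(sequence, mapping):
    """Rewrite a sequence in the reduced alphabet; unknown symbols are kept as-is."""
    return "".join(mapping.get(residue, residue) for residue in sequence)


def identity(seq_a, seq_b):
    """Fraction of matching positions in two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        return 0.0
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return matches / len(seq_a)


mapping = build_mapping(EXAMPLE_GROUPS)
seq1, seq2 = "ALKDF", "VIREY"  # made-up fragments with no identical residues
full_identity = identity(seq1, seq2)
reduced_identity = identity(reduce_sequence(seq1, mapping),
                            reduce_sequence(seq2, mapping))
```

In this made-up case the two fragments share no identical residues in the 20-letter alphabet, yet they become identical once each residue is replaced by its group representative. That is the kind of signal a reduced alphabet is meant to expose in low-identity alignments.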