An effective sequence-alignment-free superpositioning of pairwise or multiple structures with missing data
 Jianbo Lu^{1, 2},
 Guoliang Xu^{3},
 Shihua Zhang^{3} and
 Benzhuo Lu^{3}
DOI: 10.1186/s13015-016-0079-3
© The Author(s) 2016
Received: 22 October 2015
Accepted: 18 May 2016
Published: 21 June 2016
Abstract
Background
Superpositioning is an important problem in structural biology. Determining an optimal superposition requires a one-to-one correspondence between the atoms of two protein structures. However, in practice, some atoms are missing from their original structures. Current superposition implementations address the missing data crudely by omitting such atoms from the structures.
Results
In this paper, we propose an effective method for superpositioning pairwise and multiple structures without sequence alignment. It is a two-stage procedure including data reduction and data registration.
Conclusions
Numerical experiments demonstrated that our method is effective and efficient. The code package of the protein structure superposition method for addressing the cases with missing data is implemented in MATLAB and is freely available from: http://sourceforge.net/projects/pssm123/files/?source=navbar
Keywords
Superposition, Protein structure alignment, Iterative closest point
Background
Superposition is a frequently used method to measure the spatial similarity of three-dimensional objects in fields such as computer vision, image science and molecular biology. Molecular biology employs superposition to support a wide variety of tasks, and superimposing two or more protein structures is a very important problem in structural bioinformatics. Superpositioning problems have been explored by many studies [1–5]. The optimal superposition of three-dimensional (3D) conformations of similar structures is necessary in many real cases. Determining an optimal superposition normally requires a one-to-one correspondence between the atoms in the different structures [6]. The superposition of multiple structures is complicated by the fact that if structure X is superimposed on structure Y and structure Z is superimposed on structure Y, then, in general, structure X is not optimally superimposed on structure Z. In this case, the superposition of X on Z is optimal only if two of the three structures are identical in shape.
A superposition is a particular orientation of objects in three-dimensional space. There are many approaches to the superpositioning problem. One is the method proposed by Kabsch [3], which computes the optimal transformation via singular value decomposition of a covariance matrix derived from the coordinates of the corresponding three-dimensional structures. Another approach, proposed by Kearsley [7], uses the algebra of quaternions. Multiple structure superposition programs have many applications, including understanding evolutionary conservation and divergence, functional prediction, automated docking, comparative modeling, protein and ligand design, construction of benchmark data sets and protein structure prediction [8–11].
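The SVD-based approach mentioned above can be illustrated with a minimal NumPy sketch. It assumes the two point sets are already in one-to-one correspondence (row i of `P` matches row i of `Q`); the function name `kabsch` and its argument names are chosen here for illustration and do not come from the PSSM package.

```python
import numpy as np

def kabsch(P, Q):
    """Least-squares rotation R and translation t such that P @ R.T + t ~ Q.

    P, Q: (N, 3) arrays of corresponding points (assumed already matched).
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    # 3x3 covariance matrix of the centered coordinates
    H = (P - p_mean).T @ (Q - q_mean)
    U, _, Vt = np.linalg.svd(H)
    # Force det(R) = +1 so the result is a proper rotation, not a reflection
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_mean - R @ p_mean
    return R, t
```

The superimposed coordinates are then `P @ R.T + t`. The determinant check matters in practice: without it, the least-squares solution may be a mirror image.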
Structure alignment is different from superposition of structures. A structural alignment is the assignment of residue-residue correspondences between structurally similar proteins [12]. One way to represent an alignment is the familiar row-and-column matrix format, in which sequence alignments use single-letter abbreviations for residues. Alignments of amino acid sequences of proteins play important roles in structural molecular biology, such as the study of evolution in protein families, the identification of patterns of conservation in sequences, homology modeling, and protein crystal structure solution by molecular replacement.
In molecular biology, corresponding residues have similar structures. Many homologous proteins share a common core structure, in which the chain retains the topology of its folding pattern, but varies in geometric details. This retained similarity makes it possible to align the residues of the core. Since the structure of many proteins is still unknown and proteins with similar structural motifs often exhibit similar biological properties even when they are distantly related, structure alignment can help characterize the role of many proteins.
There are two approaches to protein structure alignment: sequence-based alignments and non-sequence-based alignments (e.g. Structal [13], TM-align [13], LovoAlign [13]). For closely related proteins, sequence-based alignments give consistent answers, reflecting evolutionary divergence. For distantly related proteins, however, sequence-based alignments lead to diverse residue correspondences. In this case, we need non-sequence-based alignments. Nonsequential alignments can handle many cases such as reordering of domains and circular permutations [13–15].
Most multiple structure alignment programs are based on pairwise structural alignment programs [16, 17]. Even simplified variants of structure alignment are known to be NP-hard [18, 19]. In many cases, certain residues are missing. For example, one crystal structure of a protein may omit loop regions that are present in another crystal structure of the same protein [20]. Most multiple structural alignment methods divide the problem into two subproblems. The first is to identify multiple corresponding structural elements. The second is to calculate the appropriate rigid-body transformation for each structure to create an optimal superposition.
Structure alignment programs fall into three broad classes. The first class comprises aligned fragment pair (AFP) chaining methods [21]. The second class comprises distance matrix methods [22]. The third class includes everything else, such as geometric hashing and methods using secondary structural elements [22]. THESEUS is a program that handles missing data by adopting an expectation-maximization (EM) algorithm [23]. However, the EM algorithm relies on a sequential structure alignment and is highly dependent on the choice of the initial value. In this paper, we propose a new method for nonsequential structure superposition that combines principal component analysis (PCA) and iterative closest point (ICP) registration techniques. The key point of our method is that we treat each protein as a whole structure.
In this work, we propose a simple and efficient protein structure superposition method for addressing the cases with missing data (PSSM). We adopt a two-stage procedure including data reduction and registration techniques to address this problem. We have applied it to the cytochrome C data, the Globins family data, the Serine Proteinases family data, Fischer’s dataset and simulated data to demonstrate its efficiency and accuracy.
Methods
Here we introduce a two-stage method for the optimal superposition of pairwise and multiple structures with incomplete data. In the first stage, the key is to adopt a data reduction technique to get a reduced representation which is not sensitive to noise and missing residues. Based on this representation, we can obtain a rough superposition of pairwise or multiple structures with a least-squares technique. In the second stage, we employ the powerful iterative closest point (ICP) algorithm to further refine the superposition and find the optimal solution (Fig. 1).
The iterative closest point algorithm, originally introduced in the area of computer vision for image registration, can be used in bioinformatics for the alignment of complete protein structures. Bertolazzi [24] used this method for the structural alignment of protein surfaces.
Discovering rough superpositioning based on principal-axes transform
In this section we introduce principal component analysis, the principal-axes transform technique and the rotational search needed for some cases.
Principal component analysis
Principal component analysis (PCA) is a very popular subspace analysis technique that has been applied successfully in many domains for dimension reduction. It reduces the number of variables in an analysis by describing a series of uncorrelated linear combinations of the variables that contain most of the variance. This reduction is achieved by transforming the original variables into new variables, the uncorrelated principal components. These new variables are ordered so that the first few retain most of the variation present in all of the original variables.
Computing the principal components gives the principal axes of the point set. We then rotate the points along these principal axes, which gives a good initial placement of the points. After this step, we employ the iterative closest point algorithm to further refine the superposition and find the optimal solution.
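As an illustration of this step, the following NumPy sketch computes the geometric center and principal axes of a set of 3D coordinates from the eigen-decomposition of their covariance matrix. The function name `principal_axes` is hypothetical, and the sketch makes no attempt to resolve the sign ambiguity of eigenvectors (each axis is only defined up to a factor of -1).

```python
import numpy as np

def principal_axes(points):
    """Return (center, eigenvalues, axes) for an (N, 3) coordinate array.

    Eigenvalues are sorted in descending order; column j of `axes` is the
    unit eigenvector belonging to eigenvalue j.
    """
    points = np.asarray(points, dtype=float)
    center = points.mean(axis=0)
    # np.cov expects one variable per row, hence the transpose
    cov = np.cov((points - center).T)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return center, eigvals[order], eigvecs[:, order]
```

Projecting the centered coordinates onto the axes, `(points - center) @ axes`, expresses the structure in its own principal-axes frame. Because the axis signs are arbitrary, several orientations may need to be tried before refinement.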
Principal-axes transform
Rotational search strategy
The principal-axes transform is expected to yield a correct rough superposition for many initial values. However, it may fail in some cases. To improve on this, we use a rotational search strategy that tests multiple orientations. The axis of rotation is a line through the point (0, 0, 0) (the geometric center) and u (a linear combination of the eigenvectors of one protein). The rotation interval is set to \(10^{\circ }\). In practice, the principal-axes alignment method is applied first. If the resulting superposition does not reach an RMSD (root mean squared deviation) below a given value, a rotational search is performed and the principal-axes alignment method is applied again.
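One standard way to write the rotation used in such a search is Rodrigues' rotation formula. The symbols \(\hat{u}\), \(K\) and \(\theta _j\) below are notation introduced here for illustration and are not taken from the method description. For a unit axis \(\hat{u}=u/\Vert u\Vert \) through the origin and a rotation angle \(\theta \),

\[
R(\hat{u},\theta ) = I + \sin \theta \, K + (1-\cos \theta )\, K^{2},
\qquad
K = \begin{pmatrix} 0 & -\hat{u}_z & \hat{u}_y \\ \hat{u}_z & 0 & -\hat{u}_x \\ -\hat{u}_y & \hat{u}_x & 0 \end{pmatrix}.
\]

With a \(10^{\circ }\) interval, the candidate angles are \(\theta _j = 10^{\circ }\cdot j\) for \(j=0,\ldots ,35\), which gives 36 trial orientations per axis.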
Structures with random rotations
To show the effectiveness of the PSSM method, we use random rotational matrices to generate randomly rotated counterparts of a structure. A random orthogonal matrix is generated with a MATLAB expression [i.e., orth(rand(3,3))]. Such rotational matrices change the position and orientation of the points.
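A comparable test matrix can be produced in Python. The sketch below is an assumption-laden analogue of the MATLAB expression, not a port of it: it orthogonalizes a random matrix with a QR decomposition (rather than `orth`) and, because an orthogonal matrix can have determinant -1 (a reflection), flips one column when needed so that the result is a proper rotation. No claim is made that the matrices are uniformly distributed over all rotations.

```python
import numpy as np

def random_rotation(rng):
    """Return a 3x3 orthogonal matrix with determinant +1.

    rng: a numpy.random.Generator, e.g. np.random.default_rng(0).
    """
    A = rng.random((3, 3))          # entries uniform in [0, 1)
    Q, _ = np.linalg.qr(A)          # Q has orthonormal columns
    if np.linalg.det(Q) < 0:        # reflection: flip one axis
        Q[:, 0] = -Q[:, 0]
    return Q
```

A rotated copy of a coordinate array `v` (one point per row) is then `v @ R.T` for `R = random_rotation(rng)`.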
Refining the superpositioning based on iterative closest point algorithm
 1. The point set P with \(N_p\) points \({\vec {p}}\) from the data shape and the model shape X are given.
 2. The iteration is initialized by setting \(P_0=P,~\vec q_0=[1,0,0,0,0,0,0]^T\) and \(k=0\). The registration vectors are defined relative to the initial data set \(P_0\) so that the final registration represents the complete transformation. Steps (a)–(d) below are applied until convergence within a tolerance \(\tau \).
 (a) Compute the closest points: \(Y_k=\mathcal {C}(P_k,X)\), where \(\mathcal {C}\) denotes the closest point operator.
 (b) Compute the registration: \((\vec {q}_{k},d_{k})=\mathcal {Q}(P_{0},Y_{k}).\)
 (c) Apply the registration: \(P_{k+1}=\vec {q}_{k}(P_{0})\).
 (d) Terminate the iteration when the change in mean-square error falls below a preset positive threshold \(\tau \) (i.e. \(\Vert d_{k}-d_{k+1}\Vert <\tau \)), which specifies the desired precision of the registration; otherwise, set \(k=k+1\) and go to step (a).
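The steps above can be sketched in NumPy as follows. This is an illustrative version under stated assumptions: the closest-point operator is a brute-force search, and the registration step uses an SVD-based least-squares fit in place of the quaternion registration vector \(\vec {q}\). The names `best_fit`, `closest_points` and `icp` are hypothetical.

```python
import numpy as np

def best_fit(P, Y):
    """Least-squares R, t mapping P onto Y, and the mean-square error d."""
    pm, ym = P.mean(axis=0), Y.mean(axis=0)
    U, _, Vt = np.linalg.svd((P - pm).T @ (Y - ym))
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                      # proper rotation (det = +1)
    t = ym - R @ pm
    d = np.mean(np.sum((P @ R.T + t - Y) ** 2, axis=1))
    return R, t, d

def closest_points(P, X):
    """For each point of P, return the nearest point of X (brute force)."""
    d2 = ((P[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return X[d2.argmin(axis=1)]

def icp(P, X, tau=1e-10, max_iter=100):
    """Register data shape P onto model shape X; returns R, t, d."""
    P0 = np.asarray(P, dtype=float)
    X = np.asarray(X, dtype=float)
    Pk = P0.copy()
    R, t, d = np.eye(3), np.zeros(3), np.inf
    d_prev = np.inf
    for _ in range(max_iter):
        Y = closest_points(Pk, X)           # step (a)
        R, t, d = best_fit(P0, Y)           # step (b), relative to P0
        Pk = P0 @ R.T + t                   # step (c)
        if d_prev - d < tau:                # step (d)
            break
        d_prev = d
    return R, t, d
```

Because the correspondences are recomputed from nearest neighbours at each pass, the two point sets need not have the same number of points. The result still depends on the starting orientation, which is why a rough superposition is computed first.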
The combined procedure for pairwise and multiple structure superposition
The principal component analysis gives the principal axes of each protein structure. The ICP algorithm is a powerful method for point registration. However, it only converges to a local minimum and is sensitive to the initial value. In the following, we introduce the combined procedure for pairwise structure superposition in detail.
Data preprocessing is needed. We download protein structures in Protein Data Bank (PDB) format from the National Center for Biotechnology Information (NCBI) database or another database, extract the three-dimensional coordinates and store them in txt format. The MATLAB program runs on Windows 7 with an AMD Athlon(tm) P340 Dual-Core Processor.
 1. Input the protein structure data \(P_a\) and \(P_b\), and set the initial value \(k=1\).
 2. Employ principal component analysis to find the principal components. For each of the two proteins \(P_a\) and \(P_b\), the eigenvectors and eigenvalues are calculated (\(u_1, u_2, u_3\) for \(P_a\) and \(v_1, v_2, v_3\) for \(P_b\)), and the geometric center is determined.
 3. The protein \(P_b\) is rotated. The rotation axis passes through O (0, 0, 0) (the geometric center) and is parallel to the vector v (here, v is \(v_1\), \(v_1\pm v_2\) or \(v_1\pm v_3\)). The rotation interval is set to \(10^{\circ }\).
 4. For each rotated position of \(P_b\), the eigenvectors and eigenvalues are calculated again. The principal axes of the new \(P_b\) and of \(P_a\) are aligned using the least-squares method.
 5. The ICP algorithm is applied.
 6. If RMSD \(<c\) (e.g., \(c=1.5\)) or the number of iterations exceeds a preset limit, output the cumulative rotation matrix and translation vector and stop; otherwise, go back to step 3.
 7. If RMSD \(>c\) (e.g., \(c=1.5\)) for the whole circle of rotations, choose the case with the smallest RMSD and output the corresponding rotation matrix and translation vector.
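The control flow of this procedure can be sketched as a search loop. To keep the sketch short, the principal-axes alignment and ICP refinement of steps 4 and 5 are represented by a caller-supplied function `refine(Pa, start) -> (rmsd, aligned)`, which is a hypothetical placeholder; the names `pairwise_search` and `rotation_about_axis` are also chosen here for illustration. The sketch returns the best RMSD and coordinates rather than the cumulative rotation matrix and translation vector.

```python
import numpy as np

def rotation_about_axis(axis, angle_deg):
    """Rotation matrix about a unit axis through the origin (Rodrigues)."""
    u = np.asarray(axis, dtype=float)
    u = u / np.linalg.norm(u)
    th = np.radians(angle_deg)
    K = np.array([[0.0, -u[2], u[1]],
                  [u[2], 0.0, -u[0]],
                  [-u[1], u[0], 0.0]])
    return np.eye(3) + np.sin(th) * K + (1.0 - np.cos(th)) * (K @ K)

def pairwise_search(Pa, Pb, refine, c=1.5, step_deg=10.0):
    """Try rotated starting positions of Pb until refine() gives RMSD < c.

    Returns (best_rmsd, best_aligned_Pb). If no start reaches RMSD < c,
    the smallest RMSD found over the whole search is returned.
    """
    Pa = np.asarray(Pa, dtype=float)
    Pb = np.asarray(Pb, dtype=float)
    ca = Pa.mean(axis=0)
    B = Pb - Pb.mean(axis=0)                    # Pb centered at the origin
    _, vecs = np.linalg.eigh(np.cov(B.T))       # ascending eigenvalues
    v1, v2, v3 = vecs[:, 2], vecs[:, 1], vecs[:, 0]
    axes = [v1, v1 + v2, v1 - v2, v1 + v3, v1 - v3]
    best = (np.inf, None)
    for axis in axes:
        for angle in np.arange(0.0, 360.0, step_deg):
            # rotate about the axis, then place on Pa's geometric center
            start = B @ rotation_about_axis(axis, angle).T + ca
            rmsd, aligned = refine(Pa, start)
            if rmsd < best[0]:
                best = (rmsd, aligned)
            if rmsd < c:                        # good enough: stop early
                return best
    return best
```

Separating the search loop from the refinement makes it easy to swap in a different refinement routine or a different set of candidate axes.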
The details of our multiple-structure superposition algorithm are as follows:
 1. Input the protein structures \(C=\{P_1, P_2,\ldots , P_n\}\), \(n\ge 3\).
 2. Calculate the length of each protein and sort the proteins by length.
 3. Choose the middle-sized protein as the template structure, denoted \(M_i\). For each protein in C, call the pairwise superposition algorithm and output the RMSD between this protein and the template together with the protein's index; the results form the set \(T_{i}\). The initial value of i is 1.
 4. Sort the proteins in \(T_{i}\) by RMSD in ascending order. If the RMSD \(<c\) (e.g., \(c=1.5\)), put the protein and its RMSD in set \(S_{i}\). If the RMSD \(>c\), put the protein and its RMSD in set \(T_{i+1}\).
 5. Choose the protein with the largest RMSD in set \(S_{i}\) as the template \(M_{i+1}\). For each protein in \(T_{i+1}\), call the pairwise superposition algorithm and update its RMSD in \(T_{i+1}\).
 6. Set \(i \leftarrow i+1\) and repeat steps 4 and 5 to update \(M_i\), \(T_i\) and \(S_i\).
 7. If \(T_{i}=T_{i+1}\) or \(T_{i}=\emptyset \), stop.
 8. Output the rotation matrix R and translation vector T of each protein.
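The template-switching logic of these steps can be sketched in plain Python. The pairwise superposition is passed in as a function `pairwise(template, mobile) -> rmsd`, a hypothetical placeholder. For brevity the sketch records, for each structure, which template it was fitted to and the resulting RMSD, rather than the rotation matrices and translation vectors. Names such as `multiple_superposition` are illustrative.

```python
def multiple_superposition(structures, pairwise, c=1.5):
    """Assign every structure to a template using an RMSD cutoff c.

    structures: dict mapping a name to a sequence of coordinates.
    pairwise:   callable(template_coords, mobile_coords) -> RMSD.
    Returns a dict name -> (template_name, rmsd).
    """
    # Steps 2-3: sort by length and start from the middle-sized structure
    names = sorted(structures, key=lambda n: len(structures[n]))
    template = names[len(names) // 2]
    assigned = {template: (template, 0.0)}
    pending = [n for n in names if n != template]      # plays the role of T_i

    while pending:
        scores = {n: pairwise(structures[template], structures[n])
                  for n in pending}
        # Step 4: structures below the cutoff are accepted (set S_i)
        accepted = sorted((n for n in pending if scores[n] < c),
                          key=scores.get)
        remaining = [n for n in pending if scores[n] >= c]
        for n in accepted:
            assigned[n] = (template, scores[n])
        if not accepted:
            # Step 7: T_{i+1} equals T_i, so no further progress is possible
            for n in remaining:
                assigned[n] = (template, scores[n])
            break
        # Step 5: the accepted structure with the largest RMSD is the
        # next template
        template = accepted[-1]
        pending = remaining
    return assigned
```

Switching the template to the most distant accepted structure lets the procedure reach structures that are too dissimilar from the original template to be superimposed on it directly.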
Performance metrics
Two quantities measure the quality of a protein structure superposition: the number of residues that are aligned in the superposition and the average pairwise root mean squared deviation (RMSD) between aligned atoms. Clearly, the goal is to minimize the RMSD while maximizing the number of residues used in the superposition. In the following sections, when the number of points used in a superposition is not stated, it is the size of the smaller protein in the pair.
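For reference, the RMSD between two already-matched coordinate sets can be computed as in the short sketch below. It assumes that row i of one array corresponds to row i of the other and that any superposition has already been applied; the function name `rmsd` is illustrative.

```python
import numpy as np

def rmsd(A, B):
    """Root mean squared deviation between matched (N, 3) coordinate arrays."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    if A.shape != B.shape:
        raise ValueError("A and B must contain the same number of matched points")
    # mean over points of the squared Euclidean distance, then square root
    return float(np.sqrt(np.mean(np.sum((A - B) ** 2, axis=1))))
```

Reporting the number of matched points alongside this value matters, because a small RMSD over few points and a slightly larger RMSD over many points are not directly comparable.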
Results
In this section, we test our method PSSM using both simulated data and protein structures from the PDB. We compare it with several typical methods, including least squares (LS), \(C_\alpha \)-match [26], CPSARST [27], CCP4 [28], SuperPose [29] and MUSTANG [30].
Results of the simulated data
We used the protein structure d1cih (835) as an example and generated four rotated structures using three random rotational orthogonal matrices \(r_1\), \(r_2\) and \(r_3\) and one specific matrix \(r_4\) representing a 90-degree rotation around the z-axis.
Superposition results of PSSM for pairs of identical protein structures, one generated from the other by a rotation
Structure data  Time (s)  RMSD (Å) 

\(v. v.*r_1\)  317.8  \(4.0628\times 10^{-14}\) 
\(v. v.*r_2\)  1161.4  \(4.2752\times 10^{-14}\) 
\(v. v.*r_3\)  27.3  \(5.0009\times 10^{-14}\) 
\(v. v.*r_4\)  2.3  \(2.0260\times 10^{-14}\) 
Comparison between PSSM and LS
id1 (size)–id2 (size)  LS time (s)  PSSM time (s)  LS RMSD (Å)  PSSM RMSD (Å) 

d1cih (108)–d1lfma (103)  0.002  0.331  0.6  0.6 
d1cih (108)–d2pcbb (104)  0.007  9.418  0.7  0.7 
d1cih (108)–d1m60a (104)  0.006  21.484  1.2  1.2 
d2pcbb (104)–d1m60a (104)  0.007  7.394  1.3  1.3 
d1cih (108)–d1kyow (108)  0.002  3.768  0.7  0.7 
Comparison of PSSM with \(C_\alpha \)-match and CPSARST
id1 (size)–id2 (size)  PSSM aligned  PSSM RMSD (Å)  \(C_\alpha \)-match aligned  \(C_\alpha \)-match RMSD (Å)  CPSARST aligned  CPSARST RMSD (Å) 

1nls (237)–2bqpA (228)  228  1.4  214  1.3  218  1.4 
1glh (214)–1cpn (208)  208  0.7  206  0.5  206  0.5 
1yadA (190)–2duaA (283)  190  2.6  130  1.7  151  2.4 
1zbdA (177)–1pujA (261)  177  3.2  113  1.5  130  3.2 
d1nkla (78)–d1qdma1 (77)  77  2.6  49  1.4  70  2.4 
Comparison of PSSM with CCP4 and SuperPose
id1 (size)–id2 (size)  PSSM aligned  PSSM RMSD (Å)  CCP4 aligned  CCP4 RMSD (Å)  SuperPose aligned  SuperPose RMSD (Å) 

1nls_ (237)–2bqpA (228)  228  1.4  114  1.0  205  18.1 
1glh_ (214)–1cpn_ (208)  208  0.7  156  0.4  156  0.4 
1yadA (190)–2duaA (283)  190  2.6  157  2.4  183  10.6 
1zbdA (177)–1pujA (261)  177  3.2  97  2.0  177  20.0 
We compare our PSSM method with CCP4 and SuperPose (Table 4) and find that each method has its own advantages. We use four pairs of proteins as the test system: 1nls and 2bqp, 1glh and 1cpn, 1yad and 2dua, and 1zbd and 1puj. Taking 1nls and 2bqp as an example, PSSM aligns 228 residues (\(C_\alpha \)) with an RMSD of 1.4 Å, CCP4 aligns 114 residues with an RMSD of 0.999 Å, and SuperPose aligns 205 residues with an RMSD of 18.14 Å. Compared with CCP4 and SuperPose, PSSM aligns more residues, gives an RMSD that is reasonable and competitive with that of CCP4, and shows better overall results than SuperPose. A possible reason is that SuperPose uses a secondary structural alignment strategy to guide the superposition, which suits secondary structure alignment and is good at detecting domain or hinge motions in proteins, whereas our method is designed for full-structure superposition (see more examples in Additional file 1: Tables S2 and S3).
Performance comparison on Fischer’s dataset
Fischer’s dataset (67 of 68 pairs)  DALI  MATT  PSSM 

aveAligned  155  152  186 
aveRMSD (Å)  2.77  2.87  2.90 
The usability of PSSM algorithm
From the above analysis, we can see that PSSM for pairwise structure superposition is more robust to randomly missing data than to sequentially missing data. From the two cases above and further cases we ran, we find that, for structure superposition with missing data, PSSM requires the lengths of the two proteins to differ by less than about 20%.
Multiple protein structure superposition
The RMSD of pairwise superposition between d1cih and others with PSSM for cytochrome C
PDBid1 (size)–PDBid2 (size)  Time (s)  RMSD (Å) 

d1cih (835)–d1crj (847)  2.301  0.3829 
d1cih (835)–d1csu (846)  2.685  0.3881 
d1cih (835)–d1csx (846)  2.674  0.4852 
d1cih (835)–d1yeb (847)  3.108  0.7979 
d1cih (835)–d1kyow (850)  48.480  0.9363 
d1cih (835)–d1lfma (800)  6.399  1.0420 
d1cih (835)–d2pcbb (823)  424.890  1.1760 
d1cih (835)–d1u74d (847)  1196.996  0.8338 
d1cih (835)–d1m60a (819)  754.727  1.4786 
The RMSD of pairwise superposition between 2pka and others with PSSM for serine proteinases data set
PDBid1 (size)–PDBid2 (size)  Time (s)  RMSD (Å) 

2pka (232)–3est (240)  397.0237  1.5222 
2pka (232)–1ton (227)  300.4844  1.3310 
2pka (232)–3rp2 (224)  460.3419  1.5825 
2pka (232)–4ptp (223)  236.3811  1.1994 
2pka (232)–5cha (236)  454.4466  1.7583 
2pka (232)–1ppb (295)  542.5367  2.9835 
The RMSD of pairwise superposition between 2dhbb and others with PSSM for Globins data set
PDBid1 (size)–PDBid2 (size)  Time (s)  RMSD (Å) 

2dhbb (146)–1hhoa (141)  26.9685  1.4944 
2dhbb (146)–1hhob (146)  0.2768  1.0898 
2dhbb (146)–2dhba (141)  43.1975  1.4393 
2dhbb (146)–1mbd (153)  15.4869  1.4735 
Comparison of PSSM with MUSTANG using the Globins and Serine Proteinases data sets
Data sets  PDB codes  PSSM RMSD (Å)  PSSM aligned  MUSTANG RMSD (Å)  MUSTANG aligned 

Globins (5)  1hhoa, 2dhba, 1hhob, 2dhbb, 1mbd  1.37  141  1.41  139 
Serine Proteinases (7)  3est, 2pka, 1ton, 3rp2, 4ptp, 5cha, 1ppb  1.72  223  1.56  205 
Conclusion
We have proposed an effective method, PSSM, for superpositioning pairwise and multiple structures with missing data. The method does not need a sequence alignment in advance. It employs principal component analysis to find an initial rough superposition and then uses an iterative closest point algorithm to refine it and obtain an accurate registration. To the best of our knowledge, this is the first time that PCA and the ICP algorithm have been combined to study the problem of nonsequential superposition. Numerical experiments demonstrate its accuracy and effectiveness. The method has accuracy comparable to that of the least-squares method, a classical method for protein structure superposition. However, the least-squares method needs a sequence alignment.
Abbreviations
3D: three-dimensional
EM: expectation maximization
PSSM: Protein Structure Superposition method for addressing the cases with Missing data
ICP: iterative closest point
PCA: principal component analysis
RMSD: root mean squared deviation
NCBI: National Center for Biotechnology Information
PDB: Protein Data Bank
LS: least squares
CPSARST: Circular Permutation Search Aided by Ramachandran Sequential Transformation
CCP4: Collaborative Computational Project Number 4
MUSTANG: MUltiple STructural AligNment AlGorithm
CAS: Chinese Academy of Sciences
Declarations
Authors' contributions
JL, SZ and BL conceived and designed this study; JL implemented the algorithm and carried out the experiment; JL, GX, SZ and BL analyzed the data, wrote the paper and approved the final manuscript. All authors read and approved the final manuscript.
Acknowledgements
We would like to thank Drs. Lingyun Wu, Shiyang Bai and Bin Tu for their helpful discussions.
Competing interests
The authors declare that they have no competing interests.
Funding
This project was supported by the National Natural Science Foundation of China (No. 91530102, 21573274, 11321061 and 61379092), the Outstanding Young Scientist Program of Chinese Academy of Sciences (CAS), the CAS Program for Cross and Cooperative Team of the Science and Technology Innovation, the State Key Laboratory of Scientific/Engineering Computing, the Key Laboratory of Random Complex Structures and Data and the National Center for Mathematics and Interdisciplinary Sciences at CAS.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Authors’ Affiliations
References
 1. Diamond R. On the comparison of conformations using linear and quadratic transformations. Acta Crystallogr A. 1976;32:1–10.
 2. Cohen G. Align: a program to superimpose protein coordinates, accounting for insertions and deletions. J Appl Crystallogr. 1997;30:1160–1.
 3. Kabsch W. A discussion of the solution for the best rotation to relate two sets of vectors. Acta Crystallogr Sect A: Crystal Phys Diffract Theor General Crystallogr. 1978;34:827–8.
 4. Coutsias EA, Seok C, Dill KA. Using quaternions to calculate RMSD. J Comput Chem. 2004;25:1849–57.
 5. Theobald DL, Wuttke DS. Accurate structural correlations from maximum likelihood superpositions. PLoS Comput Biol. 2008;4:43.
 6. Flower DR. Rotational superposition: a review of methods. J Mol Graph Model. 1999;17:238–44.
 7. Kearsley SK. On the orthogonal transformation used for structural comparisons. Acta Crystallogr Sect A: Foundations Crystallogr. 1989;45:208–10.
 8. Irving J, Whisstock JC, Lesk AM. Protein structural alignments and functional genomics. Proteins. 2001;42:378–82.
 9. Edgar R, Batzoglou S. Multiple sequence alignment. Curr Opin Struct Biol. 2006;16:368–73.
 10. Dunbrack RL. Sequence comparison and protein structure prediction. Curr Opin Struct Biol. 2006;16:274–84.
 11. Panchenko A, MarchlerBauer A, Bryant SH. Threading with explicit models for evolutionary conservation of structure and sequence. Proteins. 1999;S3:133–40.
 12. Martinez L, Andreani R, Martinez J. Convergent algorithms for protein structural alignment. BMC Bioinformatics. 2007;8:306.
 13. Grishin NV. Fold change in evolution of protein structures. J Struct Biol. 2001;134:167–85.
 14. Zuker Somorjai. The alignment of protein structures in three dimensions. Bull Math Biol. 1989;51:57–78.
 15. Sujatha S, Balaji S, Srinivasan N. Pali: a database of alignments and phylogeny of homologous protein structures. Bioinformatics. 2001;17:375–6.
 16. Ye Godzik. Multiple flexible structure alignment using partial order graphs. Bioinformatics. 2005;21:2362–9.
 17. Torarinsson E. Multiple structural alignment and clustering of RNA sequences. Bioinformatics. 2007;23:926–32.
 18. Goldman D, Istrail S, Papadimitriou CH. Algorithmic aspects of protein structure similarity. In: P B, editor. Proceedings of the 40th Annual Symposium on Foundations of Computer Science. Los Alamitos: IEEE Computer Society; 1999. p. 512–22.
 19. Wang L, Jiang T. On the complexity of multiple sequence alignment. J Comput Biol. 1994;1:512–22.
 20. Barthel D, Hirst J, Bazewicz J, Burke E, Krasnogor N. Procksi: a decision support system for protein (structure) comparison, knowledge, similarity and information. BMC Bioinformatics. 2007;8:416.
 21. Menke M. Matt: local flexibility aids protein multiple structure alignment. PLoS Comput Biol. 2008;4:10.
 22. Dror O. Multiple structural alignment by secondary structures: algorithm and applications. Protein Sci. 2003;12:2492–507.
 23. Theobald DL, Steindel PA. Optimal simultaneous superpositioning of multiple structures with missing data. Bioinformatics. 2012;28:1972–9.
 24. Bertolazzi P, Guerra C, Liuzzi G. A global optimization algorithm for protein surface alignment. BMC Bioinformatics. 2010;11:488.
 25. Besl PJ, McKay ND. A method for registration of 3D shapes. IEEE Trans Pattern Anal Mach Intell. 1992;14:239–56.
 26. Bachar O, Fischer D, Nussinov R, Wolfson H. A computer vision based technique for 3D sequence-independent structural comparison of proteins. Protein Eng Design Selection. 1993;6:279–87.
 27. Lo WC, Lyu PC. Cpsarst: an efficient circular permutation search tool applied to the detection of novel protein structural relationships. Genome Biol. 2008;9:11.
 28. Winn M, Ballard C, Cowtan K, Dodson E, Emsley P, Evans P, Keegan R, Krissinel E, Leslie A, McCoy A, McNicholas S, Murshudov G, Pannu N, Potterton E, Powell H, Read R, Vagin A, Wilson K. Overview of the CCP4 suite and current developments. Acta Crystallogr D Biol Crystallogr. 2011;67:235–42.
 29. Rajarshi M, Domselaar GV, Zhang H, David S. Superpose: a simple server for sophisticated structural superposition. Nucleic Acids Res. 2004;32:W590–4.
 30. Konagurthu AS, Whisstock JC, Stuckey PJ, Lesk AM. Mustang: a multiple structural alignment algorithm. Proteins. 2006;64:559–74.