A common approach to understanding the properties of a protein is to compare it to other proteins. Proteins that are similar, in terms of either their amino acid sequences or their 3-dimensional structures, often share similar functions or are evolutionarily related. The latter, structural comparison, is particularly interesting since protein structures are known to be more evolutionarily conserved than the biological sequences which encode them. Furthermore, proteins of similar structure may have similar functionality, even when their sequences differ [1].
Structural comparison is typically a problem of aligning two sets of 3-dimensional coordinates. (In most of the known structural alignment problems, each point is the 3D coordinates of a Cα atom, one per residue. Hence, for the purpose of structural alignment, a structure can be modeled as a sequence of 3D points.) The alignment usually involves a rigid transformation to superimpose the two sequences of points, and a mapping which specifies the matched points. The parameters to optimize in the alignment may differ from one situation to another, because it is not easy to single out a set of parameters that best captures the similarity between two given structures [2]. In many situations, the alignment need not match every point in the two sequences. At present, there is a consensus among molecular biologists on the use of the following two parameters [2–4]:
 1. the number of residues (points) or percentage of total residues (points) matched in the alignment.
 2. the root mean square deviation (RMSD) of the matched residues (points).
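To make the two parameters concrete, the following minimal sketch evaluates a given alignment, that is, a rigid transformation together with a mapping. The function names, the row-major rotation matrix, and the choice of the shorter structure as the denominator for the matched fraction are assumptions made for illustration; they are not taken from the cited works.

```python
import math


def apply_rigid(rotation, translation, point):
    """Apply x -> R x + t to one 3D point (rotation is a 3x3 row-major matrix)."""
    return tuple(
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )


def alignment_score(a, b, mapping, rotation, translation):
    """Return (number matched, matched fraction, RMSD) for one alignment.

    a, b        : lists of (x, y, z) C-alpha coordinates
    mapping     : list of index pairs (i, j) matching a[i] with b[j]
    rotation,
    translation : rigid transformation applied to structure b
    The fraction is taken relative to the shorter structure (an assumption).
    """
    if not mapping:
        raise ValueError("mapping must contain at least one pair")
    total = 0.0
    for i, j in mapping:
        moved = apply_rigid(rotation, translation, b[j])
        total += sum((p - q) ** 2 for p, q in zip(a[i], moved))
    matched = len(mapping)
    rmsd = math.sqrt(total / matched)
    return matched, matched / min(len(a), len(b)), rmsd
```

Note that this only scores a candidate alignment; it does not search for the transformation or the mapping, which is the hard part of the problem.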
In general, the RMSD need not be minimized. It suffices that it is within a reasonable threshold. Hence, a good alignment is customarily taken to be one which maximizes the number of residue matches, within a given RMSD threshold. Many structural alignment methods are based on this principle. The computational complexity of finding an optimal solution to the problem is not well understood. Shibuya et al. formulated a restricted version of the problem, and showed it to be NP-hard when the dimensionality is arbitrary. It is open whether their problem is NP-hard in three dimensions [5]. Other problems related to structural comparison based on the RMSD have been found to be difficult. For example, the problem of finding a substructure from multiple 3-dimensional structures which minimizes the total RMSD is NP-hard [6].
For the variants of the alignment problem that are not based on the RMSD, we have the following results. When the objective is to maximize the number of point matches which are no more than a threshold distance apart, the problem is solvable in O(n^{32.5}) time, where n is the number of points [7]. The contact map overlap problem, where a graph is created out of each structure and the task is to compare the two graphs, is NP-hard [8], and remains NP-hard even when we require points that are matchable to be within a threshold distance [9]. These results, together with an early result which shows a related problem called threading to be NP-hard [10], have traditionally led molecular biologists to believe that the structural alignment problem is difficult in general (e.g. [11–13]), even though a PTAS exists for the problem under a broad class of distance measures [14]. Heuristic algorithms have also been proposed for many variants of the structural alignment problem [15–23]. While these methods perform reasonably well in general, they provide no guarantee on the quality of their results.
As noted by Shibuya et al., relatively few theoretical results have been obtained on problems defined over the RMSD, and the general problem of structural alignment under the RMSD remains open [5]. At present, whether the problem is intractable is not only of theoretical interest but also of practical concern, due to advances in protein structure prediction which require the comparison of very large numbers of structures. In this paper we present mathematical insights and techniques which we hope will lead to practical algorithms for the problem.
We first show that the difficulty of the problem does not lie solely in either of its individual components. More precisely, the problem can be solved in polynomial time if either of the following holds:

a mapping that contains the optimal mapping is known (Theorem 3), or

the optimal superposition is known (Lemma 1).
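To give some intuition for why a known superposition makes the problem tractable, the sketch below assumes the two point sequences are already superimposed and that the mapping must preserve sequence order; both are assumptions made here for illustration. Under these assumptions, a simple dynamic program finds a largest mapping whose RMSD is at most a threshold d. This is a hypothetical illustration, not necessarily the construction used in the proof of Lemma 1.

```python
def largest_alignment_fixed_superposition(a, b, d):
    """Largest order-preserving mapping with RMSD <= d, for superimposed a and b.

    a, b : lists of (x, y, z) points, already superimposed
    d    : RMSD threshold
    Returns (size, pairs), where pairs is a list of (i, j) index pairs.
    """
    n, m = len(a), len(b)
    kmax = min(n, m)
    inf = float("inf")
    # best[i][j][k]: minimum sum of squared distances over order-preserving
    # mappings with exactly k pairs drawn from a[:i] and b[:j].
    best = [[[inf] * (kmax + 1) for _ in range(m + 1)] for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            best[i][j][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = sum((p - q) ** 2 for p, q in zip(a[i - 1], b[j - 1]))
            for k in range(1, min(i, j) + 1):
                best[i][j][k] = min(
                    best[i - 1][j][k],                 # a[i-1] unmatched
                    best[i][j - 1][k],                 # b[j-1] unmatched
                    best[i - 1][j - 1][k - 1] + cost,  # match a[i-1] with b[j-1]
                )
    # RMSD <= d is equivalent to (sum of squared distances) <= k * d^2.
    for k in range(kmax, 0, -1):
        if best[n][m][k] <= k * d * d:
            pairs = []
            i, j = n, m
            while k > 0:  # trace back one optimal mapping of size k
                if best[i][j][k] == best[i - 1][j][k]:
                    i -= 1
                elif best[i][j][k] == best[i][j - 1][k]:
                    j -= 1
                else:
                    pairs.append((i - 1, j - 1))
                    i, j, k = i - 1, j - 1, k - 1
            pairs.reverse()
            return len(pairs), pairs
    return 0, []
```

The table has O(n^3) entries for sequences of length n, so the running time is polynomial. The difficulty of the general problem comes from having to choose the superposition and the mapping together.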
Our study shows that the difficulty of the LCP problem is also largely due to two factors: (1) the problem allows the input coordinates to be of arbitrary precision, and (2) it assumes no limit on the distance between two consecutive Cα atoms.
We consider the case where the input coordinates are integral, and the distance between two consecutive points is restricted. The first requirement is practical since, in protein structures, coordinates are typically specified to a fixed precision (e.g. three decimal places [24]), and can be trivially scaled up to integral values. Similar assumptions are made in Euclidean problems such as the Euclidean TSP [25]. The second requirement likewise does not add any restriction to the problem of protein structure alignment, since there is a natural upper bound (∼3.8Å) on the distance between two consecutive Cα atoms. In this case, the following results hold.

Given a polynomial time algorithm for finding a largest alignment of RMSD below a threshold d, one can efficiently compute an alignment of a given size ℓ which minimizes the RMSD (Theorem 7). (Since the other direction is easy, this shows that the two problems are of similar difficulty.)

The structural alignment problem under the RMSD is solvable exactly in polynomial time (Theorem 10).
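The two input assumptions stated above (integral coordinates and a bounded distance between consecutive points) can be illustrated with a short sketch. It assumes coordinates are given as decimal strings with three decimal places; the function names are hypothetical, and exact decimal arithmetic is used to avoid rounding surprises when scaling.

```python
from decimal import Decimal


def to_integer_coords(points, decimals=3):
    """Scale fixed-precision coordinates (given as strings) to integers.

    With decimals=3, "12.345" becomes 12345, i.e. the unit is 0.001 of the
    original unit.
    """
    scale = 10 ** decimals
    return [
        tuple(int((Decimal(c) * scale).to_integral_value()) for c in p)
        for p in points
    ]


def max_consecutive_sq_dist(int_points):
    """Largest squared distance between consecutive points, in scaled units."""
    return max(
        sum((p - q) ** 2 for p, q in zip(u, v))
        for u, v in zip(int_points, int_points[1:])
    )
```

For example, two consecutive points 3.8 units apart map to integer points whose squared distance is 3800**2 in the scaled units, so the bound on consecutive distances remains an exact integer comparison.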