Inverse folding of RNA pseudoknot structures
 James ZM Gao^{1},
 Linda YM Li^{1} and
 Christian M Reidys^{1}
https://doi.org/10.1186/1748-7188-5-27
© Gao et al; licensee BioMed Central Ltd. 2010
Received: 5 May 2009
Accepted: 23 June 2010
Published: 23 June 2010
Abstract
Background
RNA exhibits a variety of structural configurations. Here we consider a structure to be tantamount to the noncrossing Watson-Crick and G-U base pairings (secondary structure) and additional cross-serial base pairs. These interactions are called pseudoknots and are observed across the whole spectrum of RNA functionalities. In the context of studying natural RNA structures, searching for new ribozymes and designing artificial RNA, it is of interest to find RNA sequences folding into a specific structure and to analyze their induced neutral networks. Since the established inverse folding algorithms RNAinverse, RNA-SSD and INFO-RNA are limited to RNA secondary structures, we present in this paper the inverse folding algorithm Inv, which can deal with 3-noncrossing, canonical pseudoknot structures.
Results
In this paper we present the inverse folding algorithm Inv. We give a detailed analysis of Inv, including pseudocode. We show that Inv makes it possible to design, in particular, 3-noncrossing nonplanar RNA pseudoknot structures, a class which is difficult to construct via dynamic programming routines. Inv is freely available at http://www.combinatorics.cn/cbpc/inv.html.
Conclusions
The algorithm Inv extends inverse folding capabilities to RNA pseudoknot structures. In comparison with RNAinverse it uses new ideas, for instance by considering sets of competing structures. As a result, Inv is not only able to find novel sequences even for RNA secondary structures, it does so in the context of competing structures that potentially exhibit cross-serial interactions.
1 Introduction
Despite playing a key role in a variety of contexts, pseudoknots are excluded from large-scale computational studies. Although the problem has attracted considerable attention in the last decade, pseudoknots are still considered a somewhat "exotic" structural concept. The ab initio prediction of general RNA pseudoknot structures is known to be NP-complete [7], and the algorithmic difficulties of pseudoknot folding are compounded by the fact that the thermodynamics of pseudoknots is far from being well understood.
The notion of mfe-structure is based on a specific concept of pseudoknot loops and respective loop-based energy parameters. This thermodynamic model was conceived by Tinoco and refined by Freier, Turner, Ninio, and others [13, 20–24].
1.1 k-noncrossing, σ-canonical RNA pseudoknot structures
Here, vertices and arcs correspond to the nucleotides A, G, U, C and to Watson-Crick (A-U, G-C) and (U-G) base pairs, respectively.
RNA secondary structures exhibit no crossings in their diagram representation, see Figure 3 and Figure 4, and are therefore 2-noncrossing diagrams satisfying some minimum arc-length condition. Accordingly, an RNA pseudoknot structure is a k-noncrossing diagram for some k satisfying some minimum arc-length condition.
As a natural generalization of RNA secondary structures, k-noncrossing RNA structures [27–29] were introduced. A k-noncrossing RNA structure of length n is a k-noncrossing diagram over [n] without arcs of the form (i, i + 1). In the following we assume k = 3, i.e. in the diagram representation there are at most two mutually crossing arcs, a minimum arc-length of four and a minimum stack-size of three base pairs. The notion k-noncrossing stipulates that the complexity of a pseudoknot is related to the maximal number of mutually crossing bonds. Indeed, most natural RNA pseudoknots are 3-noncrossing [30].
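To make the definition concrete, the following minimal Python sketch tests whether a diagram is 3-noncrossing and respects a minimum arc-length. It is an illustration only and not part of Inv or Cross: the function names, the representation of a diagram as a list of arcs (i, j) with i < j, and the reading of arc-length as j - i are assumptions made for this example. The stack-size condition is not checked.

```python
from itertools import combinations

def crosses(a, b):
    """Two arcs (i, j) and (k, l) cross iff i < k < j < l once ordered by start."""
    (i, j), (k, l) = sorted((a, b))
    return i < k < j < l

def is_3_noncrossing(arcs, min_arc_length=4):
    """True iff no three arcs mutually cross and every arc has j - i >= min_arc_length.

    Assumption of this sketch: arc-length of (i, j) is taken to be j - i.
    """
    if any(j - i < min_arc_length for i, j in arcs):
        return False
    return not any(
        crosses(a, b) and crosses(a, c) and crosses(b, c)
        for a, b, c in combinations(arcs, 3)
    )
```

The brute-force test over all triples of arcs is cubic in the number of arcs, which is adequate for checking a single target structure.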
1.2 Neutral networks
where the u_{h} denote the unpaired nucleotides and the p_{h} = (s_{i}, s_{j}) denote the base pairs, respectively, see Figure 8. We can then view the unpaired and the paired parts of a sequence as elements of two formal cubes, which implies a new adjacency relation for elements of C[S].
Note, however, that a p-neighbor has either Hamming distance one (G-C ↦ G-U) or Hamming distance two (G-C ↦ C-G). We call a u- or a p-neighbor, y, a compatible neighbor. In light of the adjacency notion for the set of compatible sequences we call the set of all sequences folding into S the neutral network of S. By construction, the neutral network of S is contained in C[S]. If y is contained in the neutral network we refer to y as a neutral neighbor. This leads us to consider the compatible and neutral distance of two sequences, denoted by C(s, s') and N(s, s'). These are the minimum length of a C[S]-path and of a path in the neutral network between s and s', respectively. Note that since each neutral path is in particular a compatible path, the compatible distance is always less than or equal to the neutral distance.
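The adjacency relation on C[S] can be sketched as follows. This is a minimal illustration under stated assumptions: positions are 0-based, the structure is given as a list of arcs, and the names `compatible_neighbors` and `hamming` are hypothetical helpers introduced here.

```python
# The six compatible base pairs (Watson-Crick and G-U).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def compatible_neighbors(seq, arcs):
    """Yield all u- and p-neighbors of seq with respect to arcs (0-based positions)."""
    paired = {i for arc in arcs for i in arc}
    for i, x in enumerate(seq):
        if i not in paired:                       # u-neighbor: change one unpaired base
            for y in "ACGU":
                if y != x:
                    yield seq[:i] + y + seq[i + 1:]
    for i, j in arcs:                             # p-neighbor: change one base pair
        for a, b in sorted(PAIRS):
            if (a, b) != (seq[i], seq[j]):
                s = list(seq)
                s[i], s[j] = a, b
                yield "".join(s)

def hamming(s, t):
    """Number of positions in which s and t differ."""
    return sum(x != y for x, y in zip(s, t))
```

For the sequence GAAAC with the single arc (0, 4), the p-neighbor GAAAU has Hamming distance one and the p-neighbor CAAAG has Hamming distance two, in line with the remark above.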
In this paper we study the inverse folding problem for RNA pseudoknot structures: for a given 3-noncrossing target structure S, we search for sequences from C[S] that have S as mfe configuration.
2 Background
For RNA secondary structures, there are three different strategies for inverse folding: RNAinverse, RNA-SSD and INFO-RNA [35–37].
One common assumption in these inverse folding algorithms is that the energies of specific substructures contribute additively to the energy of the entire structure. Let us proceed by analyzing the algorithms.
RNAinverse is the first inverse-folding algorithm that derives sequences that realize given RNA secondary structures as mfe-configuration. In its initialization step, a random compatible sequence s for the target T is generated. Then RNAinverse proceeds by updating the sequence s to s', s'' ... step by step, minimizing the structure distance between the mfe structure of s' and the target structure T. Based on the observation that the energy of a substructure contributes additively to the mfe of the molecule, RNAinverse optimizes "small" substructures first, eventually extending these to the entire structure. While optimizing substructures, RNAinverse performs an adaptive walk in order to decrease the structure distance. In fact, this walk is based entirely on random compatible mutations.
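The idea of an adaptive walk can be sketched generically. The following Python fragment is not the RNAinverse implementation; it is a simplified illustration in which the folding map `fold`, the structure distance `distance` and the mutation operator `mutate` are hypothetical callables supplied by the caller, and a mutation is accepted only if it strictly decreases the distance to the target.

```python
import random

def adaptive_walk(start, target, fold, distance, mutate, max_steps=1000, rng=None):
    """Random mutations, accepted only if distance(fold(seq), target) strictly decreases.

    fold, distance and mutate are assumptions of this sketch, passed in by the caller.
    Returns the final sequence together with its distance to the target.
    """
    rng = rng or random.Random()
    seq, d = start, distance(fold(start), target)
    for _ in range(max_steps):
        if d == 0:                     # target reached
            break
        cand = mutate(seq, rng)        # in practice: a random compatible mutation
        d_cand = distance(fold(cand), target)
        if d_cand < d:
            seq, d = cand, d_cand
    return seq, d
```

Any folding routine can be plugged in for `fold`; restricting the walk to a substructure amounts to passing the corresponding subsequence and sub-target.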
RNA-SSD inverse folds RNA secondary structures by initializing sequences using three specific subroutines. In the first, a particular compatible sequence is generated in which bases adjacent to helical regions are assigned noncomplementary nucleotides. In the second, nucleotides located in unpaired positions as well as helical regions are assigned at random, using specific (nonuniform) probabilities. The third routine constitutes a mechanism for minimizing the occurrence of undesired but favourable interactions between specific sequence segments. Following these subroutines, RNA-SSD derives a hierarchical decomposition of the target structure. It recursively splits the structure and thereby derives a binary decomposition tree rooted in T whose leaves correspond to T-substructures. Each non-leaf node of this tree represents a substructure obtained by merging the two substructures of its respective children. Given this tree, RNA-SSD performs a stochastic local search, starting at the leaves and subsequently working its way up to the root.
INFO-RNA constructs sequences folding into a given secondary structure by employing a dynamic programming method for finding a well suited initial sequence. This sequence has lowest energy with respect to T. Since this sequence does not necessarily fold into T (due to potentially existing competing configurations), INFO-RNA then utilizes an improved (relative to the local search routine used in RNAinverse) stochastic local search in order to find a sequence in the neutral network of T. In contrast to RNAinverse, INFO-RNA allows for increasing the distance to the target structure. At the same time, only positions that do not pair correctly and positions adjacent to these are examined.
2.1 Cross
Cross is an ab initio folding algorithm that maps RNA sequences into 3-noncrossing RNA structures. It is guaranteed to search all 3-noncrossing, σ-canonical structures and derives some (not necessarily unique), loop-based mfe-configuration. In the following we always assume σ ≥ 3. The input of Cross is an arbitrary RNA sequence s and an integer N. Its output is a list of N 3-noncrossing, σ-canonical structures, the first of which is the mfe-structure for s. This list of N structures (C_{0}, C_{1}, ..., C_{N-1}) is ordered by free energy and its first element, the mfe-structure, is denoted by Cross(s). If no N is specified, Cross assumes N = 1 as default.
All notions of minimal or maximal elements are understood to be with respect to ≺. An arc α crossing β is called minimal β-crossing if there exists no other arc α' crossing β such that α' ≺ α. Note that α can be minimal β-crossing, while β is not minimal α-crossing. 3-noncrossing diagrams exhibit the following four basic loop-types:
where (i, j) is an arc and [i, j] is an interval, i.e. a sequence of consecutive, isolated vertices (i, i + 1, ..., j - 1, j).
where (i_{2}, j_{2}) is nested in (i_{1}, j_{1}), that is, we have i_{1} < i_{2} < j_{2} < j_{1}.
where each entry denotes the substructure over the interval [ω_{h}, τ_{h}], subject to the condition that if all these substructures are simply stems, then there are at least two of them, see Figure 6.
where i_{1} = min{i_{h}} and j_{t} = max{j_{h}}, such that
 (i) the diagram induced by the arc-set P is irreducible, i.e. the dependency-graph of P (i.e. the graph having P as vertex set and in which α and α' are adjacent if and only if they cross) is connected, and
 (ii) for each (i_{h}, j_{h}) ∈ P there exists some arc β (not necessarily contained in P) such that (i_{h}, j_{h}) is minimal β-crossing.
(P2) Any i_{1} < x < j_{t}, not contained in hairpin-, interior- or multi-loops.
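Condition (i) can be checked directly from its definition. The sketch below builds the dependency-graph of an arc-set P implicitly and tests its connectivity by a depth-first search. The function names and the arc-list representation are assumptions of this example, and treating the empty arc-set as not irreducible is a convention chosen here.

```python
def crosses(a, b):
    """Two arcs (i, j) and (k, l) cross iff i < k < j < l once ordered by start."""
    (i, j), (k, l) = sorted((a, b))
    return i < k < j < l

def is_irreducible(arcs):
    """True iff the dependency-graph of the arc-set (arcs adjacent iff crossing) is connected."""
    arcs = list(arcs)
    if not arcs:
        return False                   # convention of this sketch
    seen, stack = {arcs[0]}, [arcs[0]]
    while stack:                       # depth-first search over crossing arcs
        a = stack.pop()
        for b in arcs:
            if b not in seen and crosses(a, b):
                seen.add(b)
                stack.append(b)
    return len(seen) == len(arcs)
```

For example, the arcs (1, 10), (5, 15), (12, 20) form a chain of crossings and are therefore irreducible, whereas adding a disjoint arc such as (30, 40) disconnects the dependency-graph.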
Having discussed the basic loop-types, we are now in position to state
Theorem 1 Any 3-noncrossing RNA pseudoknot structure has a unique loop-decomposition [19].
In order to discuss the organization of Cross, we introduce the basic idea behind motifs and skeleta, combinatorial structures used in the folding algorithm.
 - its core, c(S), has no noncrossing arcs and
 - its L-graph, L(S), is connected.
Having introduced motifs and skeleta we can proceed by discussing the general idea of Cross. The algorithm generates 3-noncrossing RNA structures "from top to bottom" via the following three subroutines:
I (SHADOW): In this routine we generate all maximal stacks of the structure. Note that a stack is maximal with respect to ≺ if it is not nested in some other stack. This is derived by "shadowing" the motifs, i.e. their σ-stacks are extended "from top to bottom".
II (SKELETON BRANCH): Given a shadow, the second step of Cross consists in generating the skeleta-tree. The nodes of this tree are particular 3-noncrossing structures, obtained by successive insertions of stacks. Intuitively, a skeleton encapsulates all cross-serial arcs that cannot be recursively computed. Here the tree complexity is controlled via limiting the (total) number of pseudoknots.
III (SATURATION): In the third subroutine each skeleton is saturated via DP-routines. After the saturation the mfe-3-noncrossing structure is derived.
3 The algorithm
The inverse folding algorithm Inv is based on the ab initio folding algorithm Cross. The input of Inv is the target structure, T. The latter is expressed as a character string over ":( )[ ]{ }", where ":" denotes an unpaired base and "( )", "[ ]", "{ }" denote paired bases.
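The input format can be illustrated by a small parser that converts such a string into a list of arcs. This is a sketch, not the validation routine of Inv: the name `parse_structure`, the 0-based positions, and the assumption that each bracket type is properly nested on its own are choices made for this example.

```python
def parse_structure(target):
    """Return the sorted arcs (i, j), 0-based, encoded by a string over ':()[]{}'.

    Assumption of this sketch: brackets of the same type are properly nested,
    while brackets of different types may cross.
    """
    closing = {")": "(", "]": "[", "}": "{"}
    stacks = {"(": [], "[": [], "{": []}
    arcs = []
    for j, ch in enumerate(target):
        if ch in stacks:                          # opening bracket: remember position
            stacks[ch].append(j)
        elif ch in closing:                       # closing bracket: pair with last opener
            if not stacks[closing[ch]]:
                raise ValueError(f"unmatched {ch!r} at position {j}")
            arcs.append((stacks[closing[ch]].pop(), j))
        elif ch != ":":
            raise ValueError(f"unexpected character {ch!r} at position {j}")
    for opener, stack in stacks.items():
        if stack:
            raise ValueError(f"unmatched {opener!r} at position {stack[-1]}")
    return sorted(arcs)
```

For instance, the string "((:[[:)):]]" encodes two stacks of size two that cross each other, i.e. the arcs (0, 7), (1, 6), (3, 10) and (4, 9).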
In Algorithm 7.1, we present the pseudocode of algorithm Inv. After validation of the target structure (lines 2 to 5 in Algorithm 7.1), similar to INFO-RNA, Inv constructs an initial sequence and then proceeds by a stochastic local search based on the loop decomposition of the target. This sequence is derived via the routine ADJUSTSEQ. We then decompose the target structure into loops and endow these with a linear order. According to this order we use the routine LOCALSEARCH in order to find for each loop a "proper" local solution.
3.1 ADJUSTSEQ
In this section we describe Steps 2 and 3 of the pseudocode presented in Algorithm 7.1. The routine MAKESTART, see line 8, generates, with uniform probability, a random sequence, start, which is compatible with the target.
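A uniformly random compatible sequence can be drawn by choosing every unpaired position uniformly among the four nucleotides and every arc uniformly among the six compatible base pairs. The following sketch illustrates this; it is not the MAKESTART routine itself, and the name `make_start`, the arc-list input and the 0-based positions are assumptions of this example.

```python
import random

# The six compatible base pairs (Watson-Crick and G-U).
BASE_PAIRS = ["AU", "UA", "GC", "CG", "GU", "UG"]

def make_start(n, arcs, rng=None):
    """Draw a sequence of length n uniformly from the sequences compatible with arcs."""
    rng = rng or random.Random()
    seq = [rng.choice("ACGU") for _ in range(n)]    # unpaired positions keep these values
    for i, j in arcs:                               # paired positions are overwritten
        seq[i], seq[j] = rng.choice(BASE_PAIRS)
    return "".join(seq)
```

Since positions and arcs are sampled independently, each of the 4^u 6^p compatible sequences (u unpaired positions, p arcs) is obtained with the same probability.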
We then initialize the variable seq_{min} via the sequence start and set the variable d = + ∞, where d denotes the structure distance between Cross(seq_{min}) and T.
Given the sequence start, we construct a set of potential "competitors", C, i.e. a set of structures suited as folding targets for start. In Algorithm 7.2 we show how to adjust the start sequence using the routine ADJUSTSEQ. Lines 3 to 36 of Algorithm 7.2 contain a for-loop whose maximal number of iterations is heuristically determined.
In all computer experiments we set the Cross-parameter N = 50. The subroutine executed in the loop body consists of the following three steps.
Otherwise we do not update seq_{min} and go directly to Step II.
We proceed by keeping a, replacing the arc a by a nontrivial perturbation, or removing a, arriving at a set of ten structures ν(S, a).
 1. unpaired position: If p(T, w) = 0, we replace the nucleotide at position w by a random nucleotide such that for each C_{h}(λ^{i-1}) ∈ C(λ^{i-1}), either p(C_{h}(λ^{i-1}), w) = 0 or the new nucleotide is not compatible with the nucleotide at position v, where v = p(C_{h}(λ^{i-1}), w) ≠ 0. See position 6 in Figure 20.
 2. start-point: If p(T, w) > w, set v = p(T, w). We randomly choose a compatible base pair for the positions (w, v), different from the current one, such that for each C_{h}(λ^{i-1}) ∈ C(λ^{i-1}), either p(C_{h}(λ^{i-1}), w) = 0 or the new nucleotide at position w is not compatible with the nucleotide at position u, where u = p(C_{h}(λ^{i-1}), w) > 0 is the end-point paired with w in C_{h}(λ^{i-1}) (Figure 20: (5, 9). The pair G-C retains the compatibility with (5, 9), but is incompatible with (5, 10)). Figure 21 illustrates the feasibility of this step.
 3. end-point: If 0 < p(T, w) < w, then by construction the nucleotide has already been considered in the previous step.
Therefore, updating all the nucleotides of λ^{i-1}, we arrive at the new sequence λ^{i}.
Note that the above mutation steps heuristically decrease the structure distance. However, the resulting sequence is not necessarily incompatible with all competitors. For instance, consider a competitor C_{h} whose arcs are all contained in T. Since λ^{i} is compatible with T, λ^{i} is compatible with C_{h}. Since competitors are obtained from suboptimal folds, such a scenario may arise.
In practice, this situation is not a problem, since competitors of this type are likely to be ruled out by virtue of the fact that they have an mfe larger than that of the target structure.
Accordingly, competitors are eliminated due to two equally important criteria: incompatibility as well as minimum free energy considerations.
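The compatibility argument can be made explicit with a short check. The sketch below is an illustration under stated assumptions: `is_compatible` is a hypothetical helper, positions are 0-based, and the example sequence and arcs are invented for this purpose.

```python
# The six compatible base pairs (Watson-Crick and G-U).
PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}

def is_compatible(seq, arcs):
    """True iff every arc (i, j) connects two nucleotides that can form a base pair."""
    return all(seq[i] + seq[j] in PAIRS for i, j in arcs)

# Hypothetical example: a hairpin target and two competitors.
target = [(0, 8), (1, 7), (2, 6)]
seq = "GGCAAAGCC"
sub_competitor = [(0, 8), (1, 7)]      # all arcs contained in the target
other_competitor = [(0, 6)]            # pairs G with G, hence excluded

print(is_compatible(seq, target))            # True
print(is_compatible(seq, sub_competitor))    # True: cannot be excluded by mutation
print(is_compatible(seq, other_competitor))  # False
```

A competitor of the first kind can only be ruled out by its free energy, which is the second criterion named above.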
If the distance of Cross(λ^{i}) to T is less than or equal to d_{min} + 5, we return to Step I (with λ^{i}). Otherwise, we repeat Step III (at most 5 times), thereby generating further candidate sequences, and continue with the candidate λ for which d(Cross(λ), T) is minimal.
The procedure ADJUSTSEQ employs the negative paradigm [16] in order to exclude energetically close conformations. It returns the sequence seq_{middle} which is tailored to realize the target structure as mfe-fold.
3.2 DECOMPOSE and LOCALSEARCH
In this section we introduce two routines, DECOMPOSE and LOCALSEARCH. The routine DECOMPOSE partitions T into linearly ordered, energy-independent components, see Figure 13 and Section 2.1. LOCALSEARCH iteratively constructs an optimal sequence for T via local solutions that are optimal for certain substructures of T.
where the T_{w} are the loops together with all arcs in the associated stems of the target.
 1. T_{w} is nested in T_{h}, or
 2. the start-point of T_{w} precedes that of T_{h}.
projecting the loop T_{w} onto the interval [l(T_{w}), r(T_{w})] and b_{w} = [l', r'] ⊃ a_{w}, being the maximal interval consisting of a_{w} and its adjacent unpaired consecutive nucleotides, see Figure 13. Given two consecutive loops T_{w} < T_{w+1}, we have two scenarios:
LOCALSEARCH: Given the sequence of intervals I_{1}, I_{2}, ..., I_{m}, we proceed by performing a local stochastic search on the corresponding subsequences (initialized via seq = seq_{middle}, where s_{[x, y]} = s_{x} s_{x+1} ... s_{y}). When we perform the local search on such a subsequence, only positions that contribute to the distance to the target, see Figure 10, or positions adjacent to these, are altered. We use the arrays U_{1}, U_{2} to store the unpaired and paired positions of T. In this process, we allow mutations that increase the structure distance by less than five with probability 0.1, see Algorithm 7.3. The latter parameter is heuristically determined. We iterate this routine until the distance is either zero or some halting criterion is met.
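The acceptance rule of the local search can be sketched as follows. The function mirrors the two branches in lines 25 to 29 of Algorithm 7.3; the name `accept` and its parameters `slack` and `p_worse` are hypothetical and introduced for this illustration. Candidates with d = d_{min}, which Algorithm 7.3 collects in the set E, are not treated by this sketch.

```python
import random

def accept(d_new, d_min, rng, slack=5, p_worse=0.1):
    """Decide whether the search continues from a candidate with distance d_new.

    Improvements are always accepted; candidates that are worse by less than
    `slack` are accepted with probability `p_worse`; all others are rejected.
    """
    if d_new < d_min:
        return True
    if d_min < d_new < d_min + slack:
        return rng.random() < p_worse
    return False
```

Occasionally accepting a slightly worse candidate allows the search to leave local minima of the structure distance, at the price of a longer walk.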
4 Discussion
The main result of this paper is the presentation of the algorithm Inv, freely available at http://www.combinatorics.cn/cbpc/inv.html
The core of Inv is a stochastic local search routine which is based on the fact that each 3-noncrossing RNA structure has a unique loop-decomposition, see Theorem 1 in Section 2.1. Inv generates "optimal" subsequences and eventually arrives at a global solution for T itself. Inv generalizes the existing inverse folding algorithms by considering arbitrary 3-noncrossing canonical pseudoknot structures. Conceptually, Inv differs from INFO-RNA in how the start sequence is generated and in the particulars of the local search itself.
7 Appendix
7.1 Algorithm 7.1 - INVERSEFOLD
Input: k-noncrossing target structure T
Output: an RNA sequence seq
Require: k ≤ 3 and T is composed of ":( ) [ ] { }"
Ensure: Cross(seq) = T
1. ▻ Step 1: Validate structure
2. if false = CHECKSTRU(T) then
3. print incorrect structure
4. return NIL
5. end if
6.
7. ▻ Step 2: Generate the start sequence
8. start ← MAKESTART(T)
9.
10. ▻ Step 3: Adjust the start sequence
11. seq _{middle} ← ADJUSTSEQ(start, T)
12.
13. ▻ Step 4: Decompose T and derive the ordered intervals.
14. Interval array I ← DECOMPOSE(T)
15. m ← |I| ▻ I satisfies I_{m} = T
16.
17. ▻ Step 5: Stochastic Local Search
18. seq ← seq _{middle}
19. for all intervals in the array I _{ w } do
20. l ← startpoint(I _{ w })
21. r ← endpoint(I _{ w })
22. s' ← seq_{[l, r]}▻ get subsequence
23. seq_{[l, r]} ← LOCALSEARCH(s', I_{w})
24. end for
25.
26. ▻ Step 6: output
27. if T = Cross(seq) then
28. return seq
29. else
30. print Failed!
31. return NIL
32. end if
7.2 Algorithm 7.2 - ADJUSTSEQ
Input: the original start sequence start
Input: the target structure T
Output: an initialized sequence seq_{middle}
1. n ← length of T
2. d_{min} ← +∞, seq_{min} ← start
4. ▻ Step I: generate the set C^{0}(λ^{i-1}) via Cross
5. C^{0}(λ^{i-1}) ← Cross(λ^{i-1}, N)
7. if d = 0 then
8. return λ^{i-1}
9. else if d < d_{min} then
10. d_{min} ← d, seq_{min} ← λ^{i-1}
11. end if
12.
13. ▻ Step II: generate the competitor set C(λ^{i-1})
14. C^{1}(λ^{i-1}) ← ∅
15. for all ∈ C^{1}(λ^{i-1}) do
18. end for
19. end for
20. C(λ^{i-1}) =
22.
23. ▻ Step III: mutation
24. seq ← λ^{i-1}
25. for w = 1 to n do
26. if ∃ C_{h}(λ^{i-1}) ∈ C(λ^{i-1}) s.t. p(C_{h}, w) ≠ p(T, w) then
27. seq[w] ← random nucleotide or pair, s.t. seq ∈ C[T] and seq ∉ C[C_{h}(λ^{i-1})]
28. end if
29. end for
30. T _{ seq }← Cross(seq)
31. if d(T _{ seq }, T) < d _{min} + 5 then
32. seq _{middle} ← seq
33. else if Step III run less than 5 times then
34. goto Step III
35. end if
36. end for ▻ loop to line 3
37.
38. return seq _{middle}
7.3 Algorithm 7.3 - LOCALSEARCH
Input: seq _{middle}
Input: the target T
Output: seq
Ensure: Cross(seq) = T
1. seq ← seq _{middle}
2. if Cross(seq) = T then
3. return seq
4. end if
5. decompose T and derive the ordered intervals
6. I ← [I _{1}, I _{2}, ..., I _{ m }]
7. for all I_{ w } in I do
8. ▻ Phase I: Identify positions
10.
13.
14. ▻ Phase II: Test and Update
15. for all p in U_{1} do
16. random T-compatible mutation of seq_{p}
17. end for
18. for all [p, q] in U_{2} do
19. random T-compatible mutation of seq_{p}
20. end for
21.
22. E ← ∅
23. for all p ∈ U _{1}, U _{2} do
24. d ← d(T, Cross(seq _{ p }))
25. if d < d _{ min } then
26. d _{min} ← d, seq ← seq _{ p }
27. goto Phase I
28. else if d _{ min }< d < d _{ min }+ 5 then
29. goto Phase I with the probability 0.1
30. end if
31. if d = d _{ min } then
32. E ← E ∪ {seq}
33. end if
34. end for
35. seq ← e _{0} ∈ E, where e _{0} has the lowest mfe in E
36. if Phase I run less than 10 n times then
37. goto Phase I
38. end if
39. end for
40. return seq
8 Acknowledgements
We are grateful to Fenix W.D. Huang for discussions. Special thanks go to the two anonymous referees, whose thoughtful comments have greatly helped in deriving an improved version of the paper. This work was supported by the 973 Project, the PCSIRT of the Ministry of Education, the Ministry of Science and Technology, and the National Science Foundation of China.
References
 1. Westhof E, Jaeger L: RNA pseudoknots. Curr Opin Struct Biol. 1992, 2 (3): 327-333. 10.1016/0959-440X(92)90221-R
 2. Loria A, Pan T: Domain structure of the ribozyme from eubacterial ribonuclease P. RNA. 1996, 2: 551-563.
 3. Staple DW, Butcher SE: Pseudoknots: RNA structures with diverse functions. PLoS Biol. 2005, 3 (6): e213. 10.1371/journal.pbio.0030213
 4. Konings DA, Gutell RR: A comparison of thermodynamic foldings with comparatively derived structures of 16S and 16S-like rRNAs. RNA. 1995, 1: 559-574.
 5. Tuerk C, MacDougal S, Gold L: RNA pseudoknots that inhibit human immunodeficiency virus type 1 reverse transcriptase. Proc Natl Acad Sci USA. 1992, 89 (15): 6988-6992. 10.1073/pnas.89.15.6988
 6. Chamorro A, Manko VS, Denisova TE: New exact solution for the exterior gravitational field of a charged spinning mass. Phys Rev D. 1991, 44 (10): 3147-3151. 10.1103/PhysRevD.44.3147
 7. Lyngsø RB, Pedersen CNS: RNA pseudoknot prediction in energy-based models. J Comput Biol. 2000, 7 (3-4): 409-427. 10.1089/106652700750050862
 8. Smith TF, Waterman MS: RNA secondary structure: A complete mathematical analysis. Math Biol. 1978, 42: 257-266.
 9. Waterman MS, Smith TF: Rapid dynamic programming methods for RNA secondary structure. Adv Appl Math. 1986, 7 (4): 455-464. 10.1016/0196-8858(86)90025-4
 10. Zuker M, Stiegler P: Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucl Acids Res. 1981, 9: 133-148. 10.1093/nar/9.1.133
 11. Nussinov B, Jacobson AB: Fast algorithm for predicting the secondary structure of single-stranded RNA. Proc Natl Acad Sci USA. 1980, 77 (11): 6309-6313. 10.1073/pnas.77.11.6309
 12. Fresco JR, Alberts BM, Doty P: Some molecular details of the secondary structure of ribonucleic acid. Nature. 1960, 188: 98-101. 10.1038/188098a0
 13. Jun IT, Uhlenbeck OC, Levine MD: Estimation of Secondary Structure in Ribonucleic Acids. Nature. 1971, 230 (5293): 362-367. 10.1038/230362a0
 14. Delisi C, Crothers DM: Prediction of RNA secondary structure. Proc Natl Acad Sci USA. 1971, 68 (11): 2682-2685. 10.1073/pnas.68.11.2682
 15. Rivas E, Eddy SR: A dynamic programming algorithm for RNA structure prediction including pseudoknots. J Mol Biol. 1999, 285 (5): 2053-2068. 10.1006/jmbi.1998.2436
 16. Dirks RM, Lin M, Winfree E, Pierce NA: Paradigms for computational nucleic acid design. Nucleic Acids Res. 2004, 32 (4): 1392-1403. 10.1093/nar/gkh291
 17. Reeder J, Giegerich R: Design, implementation and evaluation of a practical pseudoknot folding algorithm based on thermodynamics. BMC Bioinformatics. 2004, 5 (104): 2053-2068.
 18. Ren J, Rastegari B, Condon A, Hoos H: HotKnots: Heuristic prediction of RNA secondary structures including pseudoknots. RNA. 2005, 15: 1494-1504. 10.1261/rna.7284905
 19. Huang FWD, Peng WWJ, Reidys CM: Folding 3-noncrossing RNA pseudoknot structures. J Comp Biol. 2009, 16 (11): 1549-75. 10.1089/cmb.2008.0194
 20. Borer PN, Dengler B, Tinoco JI, Uhlenbeck OC: Stability of ribonucleic acid double-stranded helices. J Mol Biol. 1974, 86 (4): 843-853. 10.1016/0022-2836(74)90357-X
 21. Papanicolaou C, Gouy M, Ninio J: An energy model that predicts the correct folding of both the tRNA and the 5S RNA molecules. Nucleic Acids Res. 1984, 12: 31-44. 10.1093/nar/12.1Part1.31
 22. Turner DH, Sugimoto N, Freier SM: RNA structure prediction. Ann Rev Biophys Biophys Chem. 1988, 17: 167-192. 10.1146/annurev.bb.17.060188.001123
 23. Walter AE, Turner DH, Kim J, Lyttle MH, Muller P, Mathews DH, Zuker M: Coaxial stacking of helixes enhances binding of oligoribonucleotides and improves predictions of RNA folding. Proc Natl Acad Sci USA. 1994, 91 (20): 9218-9222. 10.1073/pnas.91.20.9218
 24. Xia T, SantaLucia JJ, Burkard ME, Kierzek R, Schroeder SJ, Jiao X, Cox C, Turner DH: Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry. 1998, 37 (42): 14719-14735. 10.1021/bi9809425
 25. Waterman MS: Combinatorics of RNA hairpins and cloverleaves. Stud Appl Math. 1979, 60: 91-96.
 26. D Kleitman BR: The number of finite topologies. Proc Amer Math Soc. 1970, 25: 276-282. 10.1090/S0002-9939-1970-0253944-9
 27. Jin EY, Qin J, Reidys CM: Combinatorics of RNA structures with pseudoknots. Bull Math Biol. 2008, 70: 45-67. 10.1007/s11538-007-9240-y
 28. Jin EY, Reidys CM: Combinatorial Design of Pseudoknot RNA. Adv Appl Math. 2009, 42 (2): 135-151. 10.1016/j.aam.2008.06.003
 29. Chen WYC, Han HSW, Reidys CM: Random k-noncrossing RNA Structures. Proc Natl Acad Sci USA. 2009, 106 (52): 22061-22066. 10.1073/pnas.0907269106
 30. Stadler PF: RNA Structures with Pseudo-Knots. Bull Math Biol. 1999, 61: 437-467. 10.1006/bulm.1998.0085
 31. Ma G, Reidys CM: Canonical RNA Pseudoknot Structures. J Comput Biol. 2008, 15 (10): 1257-1273. 10.1089/cmb.2008.0121
 32. Huang FWD, Reidys CM: Statistics of canonical RNA pseudoknot structures. J Theor Biol. 2008, 253 (3): 570-578. 10.1016/j.jtbi.2008.04.002
 33. Reidys CM, Stadler PF, Schuster P: Generic properties of combinatory maps: neutral networks of RNA secondary structures. Bull Math Biol. 1997, 59 (2): 339-397. 10.1007/BF02462007
 34. Reidys CM: Local connectivity of neutral networks. Bull Math Biol. 2008, 71 (2): 265-290. 10.1007/s11538-008-9356-8
 35. Hofacker I, Fontana W, Stadler P, Bonhoeffer L, Tacker M, Schuster P: Fast folding and comparison of RNA secondary structures. Chem Month. 1994, 125 (2): 167-188. 10.1007/BF00818163
 36. Andronescu M, Fejes AP, Hutter F, Hoos HH, A C: A New Algorithm for RNA Secondary Structure Design. J Mol Biol. 2004, 336 (2): 607-624. 10.1016/j.jmb.2003.12.041
 37. Busch A, Backofen R: INFO-RNA - a fast approach to inverse RNA folding. Bioinformatics. 2006, 22 (15): 1823-1831. 10.1093/bioinformatics/btl194
 38. Jin EY, Reidys CM: Central and local limit theorems for RNA structures. J Theor Biol. 2008, 253 (3): 547-559. 10.1016/j.jtbi.2007.09.020
 39. PseudoBase. http://www.ekevanbatenburg.nl/PKBASE/PKBGETCLS.HTML
 40. The pseudoknot structure of the glmS ribozyme pseudoknot P1.1. http://www.ekevanbatenburg.nl/PKBASE/PKB00276.HTML
 41. Pseudoknot PKI of the internal ribosomal entry site (IRES) region. http://www.ekevanbatenburg.nl/PKBASE/PKB00221.HTML
 42. The pseudoknot of SELEX-isolated inhibitor (ligand 70.28) of HIV-1 reverse transcriptase. http://www.ekevanbatenburg.nl/PKBASE/PKB00066.HTML
 43. Pseudoknot PK2 of E.coli tmRNA. http://www.ekevanbatenburg.nl/PKBASE/PKB00050.HTML
 44. Pineapple mealybug wilt associated virus - 2. http://www.ekevanbatenburg.nl/PKBASE/PKB00270.HTML
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.