The standard free energy changes $\Delta {G}_{i,j}^{0}$ and *ΔG*^{0} are estimated using the Turner nearest-neighbor rules [11, 18, 37], and pseudoknots are excluded. In that case, $\Delta {G}_{i,j}^{0}$ depends on two independent contributions, one for the interior fragment containing the pair *i*, *j* and bases in between (but excluding bases from the 5′ end to *i*−1 and from *j*+1 to the 3′ end), and one for the exterior fragment containing the pair and bases from the two ends (but excluding bases from *i*+1 to *j*−1). We define *V*_{i,j} to be the relative standard free energy change for the interior fragment in the case that *i*<*j* and for the exterior fragment otherwise, following the convention used in the mfold prediction software [38]. In that case,

$\Delta {G}_{i,j}^{0}=-RT\,\log\left[\exp\left(-{V}_{i,j}/RT\right)+\exp\left(-{V}_{j,i}/RT\right)\right]$

(3)

where

$\begin{array}{l}{Q}_{i,j}^{\text{hairpin}}\phantom{\rule{2pt}{0ex}}=\phantom{\rule{2pt}{0ex}}w\left(\Delta {G}_{i,j}^{\text{hairpin}}\right)\end{array}$

(6)

$\begin{array}{l}{Q}_{i,j}^{\text{stack}}\phantom{\rule{2pt}{0ex}}=\phantom{\rule{2pt}{0ex}}w\left({V}_{i+1,j-1}+\Delta {G}_{i,j,i+1,j-1}^{\text{stack}}\right)\end{array}$

(7)

$\begin{array}{l}{Q}_{i,j}^{\text{internal}}\phantom{\rule{2pt}{0ex}}=\phantom{\rule{2pt}{0ex}}\sum _{i<{i}^{\prime}<{j}^{\prime}<j}w\left({V}_{{i}^{\prime},{j}^{\prime}}+\Delta {G}_{i,j,{i}^{\prime},{j}^{\prime}}^{\text{internal}}\right)\end{array}$

(8)

$\begin{array}{l}{Q}_{i,j}^{\text{multibranch}}\phantom{\rule{2pt}{0ex}}=\phantom{\rule{2pt}{0ex}}w\left({W}_{i+1,j-1}^{\text{MB}}+a+c\right)+w\left({W}_{i+2,j-1}^{\text{MB}}+\Delta {G}_{i,j,i+1}^{{3}^{\prime}\mathit{\text{dangle}}}+a+b+c\right)+\\ \phantom{\rule{9em}{0ex}}w\left({W}_{i+1,j-2}^{\text{MB}}+\Delta {G}_{i,j,j-1}^{{5}^{\prime}\mathit{\text{dangle}}}+a+b+c\right)+\\ \phantom{\rule{9em}{0ex}}w\left({W}_{i+2,j-2}^{\text{MB}}+\Delta {G}_{i,j,i+1,j-1}^{\mathrm{terminal}\phantom{\rule{1em}{0ex}}\mathit{\text{mismatch}}}+a+2b+c\right)+\\ \phantom{\rule{9em}{0ex}}\sum _{i<k<j}w\left({V}_{i+1,k}+{Y}_{k+1,j-1}+\Delta {G}_{j,i,i+1,k}^{\text{coaxial flush}}+a+2c\right)+\\ \phantom{\rule{9em}{0ex}}\sum _{i<k<j}w\left({V}_{i+2,k}+{Y}_{k+2,j-1}+\Delta {G}_{j,i,i+2,k}^{\text{coaxial mismatch(2)}}+a+2b+2c\right)+\\ \phantom{\rule{9em}{0ex}}\sum _{i<k<j}w\left({V}_{i+2,k}+{Y}_{k+1,j-2}+\Delta {G}_{j,i,i+2,k}^{\text{coaxial mismatch(1)}}+a+2b+2c\right)+\\ \phantom{\rule{9em}{0ex}}\sum _{i<k<j}w\left({V}_{k,j-1}+{Y}_{i+1,k-1}+\Delta {G}_{k,j-1,j,i}^{\text{coaxial flush}}+a+2c\right)+\\ \phantom{\rule{9em}{0ex}}\sum _{i<k<j}w\left({V}_{k,j-2}+{Y}_{i+1,k-2}+\Delta {G}_{k,j-2,j,i}^{\text{coaxial mismatch(1)}}+a+2b+2c\right)+\\ \phantom{\rule{9em}{0ex}}\sum _{i<k<j}w\left({V}_{k,j-2}+{Y}_{i+2,k-1}+\Delta {G}_{k,j-2,j,i}^{\text{coaxial mismatch(2)}}+a+2b+2c\right)\end{array}$

(9)

$\begin{array}{l}{Q}_{i,j}^{\text{exterior}}\phantom{\rule{2pt}{0ex}}=\phantom{\rule{2pt}{0ex}}w\left({W}_{i+1}^{{3}^{\prime}}+{W}_{j-1}^{{5}^{\prime}}\right)+w\left({W}_{i+2}^{{3}^{\prime}}+{W}_{j-1}^{{5}^{\prime}}+\Delta {G}_{i,j,i+1}^{{3}^{\prime}\mathit{\text{dangle}}}\right)+\\ \phantom{\rule{6.7em}{0ex}}w\left({W}_{i+1}^{{3}^{\prime}}+{W}_{j-2}^{{5}^{\prime}}+\Delta {G}_{i,j,j-1}^{{5}^{\prime}\mathit{\text{dangle}}}\right)+w\left({W}_{i+2}^{{3}^{\prime}}+{W}_{j-2}^{{5}^{\prime}}+\Delta {G}_{i,j,i+1,j-1}^{\mathrm{terminal}\phantom{\rule{1em}{0ex}}\mathit{\text{mismatch}}}\right)+\\ \phantom{\rule{6.7em}{0ex}}\sum _{i<k<j}w\left({V}_{i+1,k}+{W}_{k+1}^{{3}^{\prime}}+{W}_{j-1}^{{5}^{\prime}}+\Delta {G}_{j,i,i+1,k}^{\text{coaxial flush}}\right)+\\ \phantom{\rule{6.7em}{0ex}}\sum _{i<k<j}w\left({V}_{i+2,k}+{W}_{k+2}^{{3}^{\prime}}+{W}_{j-1}^{{5}^{\prime}}+\Delta {G}_{j,i,i+2,k}^{\text{coaxial mismatch(2)}}\right)+\\ \phantom{\rule{6.7em}{0ex}}\sum _{i<k<j}w\left({V}_{i+2,k}+{W}_{k+1}^{{3}^{\prime}}+{W}_{j-2}^{{5}^{\prime}}+\Delta {G}_{j,i,i+2,k}^{\text{coaxial mismatch(1)}}\right)+\\ \phantom{\rule{6.7em}{0ex}}\sum _{i<k<j}w\left({V}_{k,j-1}+{W}_{i+1}^{{3}^{\prime}}+{W}_{k-1}^{{5}^{\prime}}+\Delta {G}_{k,j-1,j,i}^{\text{coaxial flush}}\right)+\\ \phantom{\rule{6.7em}{0ex}}\sum _{i<k<j}w\left({V}_{k,j-2}+{W}_{i+1}^{{3}^{\prime}}+{W}_{k-2}^{{5}^{\prime}}+\Delta {G}_{k,j-2,j,i}^{\text{coaxial mismatch(1)}}\right)+\\ \phantom{\rule{6.7em}{0ex}}\sum _{i<k<j}w\left({V}_{k,j-2}+{W}_{i+2}^{{3}^{\prime}}+{W}_{k-1}^{{5}^{\prime}}+\Delta {G}_{k,j-2,j,i}^{\text{coaxial mismatch(2)}}\right)\end{array}$

(10)

$\begin{array}{l}w\left({W}_{i,j}^{\mathrm{L}}\right)\phantom{\rule{2pt}{0ex}}=\phantom{\rule{2pt}{0ex}}w\left({W}_{i+1,j}^{\mathrm{L}}+b\right)+w\left({V}_{i,j}+c\right)+w\left({V}_{i,j-1}+\Delta {G}_{j-1,i,j}^{{3}^{\prime}\mathit{\text{dangle}}}+b+c\right)+\\ \phantom{\rule{6em}{0ex}}w\left({V}_{i+1,j}+\Delta {G}_{j,i+1,i}^{{5}^{\prime}\mathit{\text{dangle}}}+b+c\right)+w\left({V}_{i+1,j-1}+\Delta {G}_{j-1,i+1,j,i}^{\mathrm{terminal}\phantom{\rule{1em}{0ex}}\mathit{\text{mismatch}}}+2b+c\right)\end{array}$

(11)

$\begin{array}{l}w\left({W}_{i,j}^{\mathrm{Q}}\right)\phantom{\rule{2pt}{0ex}}=\phantom{\rule{2pt}{0ex}}w\left({V}_{i,j}\right)+w\left({V}_{i,j-1}+\Delta {G}_{j-1,i,j}^{{3}^{\prime}\mathit{\text{dangle}}}\right)+w\left({V}_{i+1,j}+\Delta {G}_{j,i+1,i}^{{5}^{\prime}\mathit{\text{dangle}}}\right)+\\ \phantom{\rule{6em}{0ex}}w\left({V}_{i+1,j-1}+\Delta {G}_{j-1,i+1,j,i}^{\mathrm{terminal}\phantom{\rule{1em}{0ex}}\mathit{\text{mismatch}}}\right)\end{array}$

(12)

$\begin{array}{l}w\left({W}_{i,j}\right)\phantom{\rule{2pt}{0ex}}=\phantom{\rule{2pt}{0ex}}w\left({W}_{i,j-1}+b\right)+w\left({W}_{i,j}^{\mathrm{L}}\right)\end{array}$

(13)

$\begin{array}{l}w\left({W}_{i,j}^{\text{coax}}\right)\phantom{\rule{2pt}{0ex}}=\phantom{\rule{2pt}{0ex}}\sum _{i<k<j}w\left({V}_{i,k}+{V}_{k+1,j}+\Delta {G}_{i,k,k+1,j}^{\text{coaxial flush}}+2c\right)+\\ \phantom{\rule{6.8em}{0ex}}\sum _{i<k<j}w\left({V}_{i+1,k}+{V}_{k+2,j}+\Delta {G}_{i+1,k,k+2,j}^{\text{coaxial mismatch(1)}}+2b+2c\right)+\\ \phantom{\rule{6.8em}{0ex}}\sum _{i<k<j}w\left({V}_{i,k}+{V}_{k+2,j-1}+\Delta {G}_{i,k,k+2,j-1}^{\text{coaxial mismatch(2)}}+2b+2c\right)\end{array}$

(14)

$\begin{array}{l}w\left({Z}_{i,j}\right)\phantom{\rule{2pt}{0ex}}=\phantom{\rule{2pt}{0ex}}w\left({W}_{i,j}^{\text{coax}}\right)+w\left({V}_{i,j}+c\right)+w\left({V}_{i,j-1}+\Delta {G}_{j-1,i,j}^{{3}^{\prime}\mathit{\text{dangle}}}+b+c\right)+\\ \phantom{\rule{5.6em}{0ex}}w\left({V}_{i+1,j}+\Delta {G}_{j,i+1,i}^{{5}^{\prime}\mathit{\text{dangle}}}+b+c\right)+w\left({V}_{i+1,j-1}+\Delta {G}_{j-1,i+1,j,i}^{\mathrm{terminal}\phantom{\rule{1em}{0ex}}\mathit{\text{mismatch}}}+2b+c\right)\end{array}$

(15)

$\begin{array}{l}w\left({W}_{i,j}^{\text{MBL}}\right)\phantom{\rule{2pt}{0ex}}=\phantom{\rule{2pt}{0ex}}w\left({W}_{i+1,j}^{\text{MBL}}\right)+w\left({W}_{i,j}^{\text{coax}}\right)+\sum _{i<k<j}w\left({Z}_{i,k}+{Y}_{k+1,j}^{\mathrm{L}}\right)\end{array}$

(16)

$\begin{array}{l}w\left({W}_{i,j}^{\text{MB}}\right)\phantom{\rule{2pt}{0ex}}=\phantom{\rule{2pt}{0ex}}w\left({W}_{i,j-1}^{\text{MB}}+b\right)+w\left({W}_{i,j}^{\text{MBL}}\right)\end{array}$

(17)

$\begin{array}{l}w\left({Y}_{i,j}\right)\phantom{\rule{2pt}{0ex}}=\phantom{\rule{2pt}{0ex}}w\left({W}_{i,j}\right)+w\left({W}_{i,j}^{\text{MB}}\right)\end{array}$

(18)

$\begin{array}{l}w\left({Y}_{i,j}^{\mathrm{L}}\right)\phantom{\rule{2pt}{0ex}}=\phantom{\rule{2pt}{0ex}}w\left({W}_{i,j}^{\mathrm{L}}\right)+w\left({W}_{i,j}^{\text{MBL}}\right)\end{array}$

(19)

$\begin{array}{l}w\left({W}_{i}^{{5}^{\prime}}\right)\phantom{\rule{2pt}{0ex}}=\phantom{\rule{2pt}{0ex}}w\left({W}_{i-1}^{{5}^{\prime}}\right)+\sum _{j<i}w\left({W}_{j-1}^{{5}^{\prime}}+{W}_{j,i}^{\mathrm{Q}}\right)\end{array}$

(20)

$\begin{array}{l}w\left({W}_{i}^{{3}^{\prime}}\right)\phantom{\rule{2pt}{0ex}}=\phantom{\rule{2pt}{0ex}}w\left({W}_{i+1}^{{3}^{\prime}}\right)+\sum _{j>i}w\left({W}_{j+1}^{{3}^{\prime}}+{W}_{i,j}^{\mathrm{Q}}\right)\end{array}$

(21)

These recursions differ slightly from, but are equivalent to, those presented in reference [20] and used in the previous code. Note that equation 15 of reference [20] contains an error: in the second line, WMBL(k+1,j) should be replaced by [WMBL(k+1,j) + WL(k+1,j)].
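As an illustration of how one of the recursions above might be evaluated, the following Python sketch implements equation 20 directly in free-energy space. It assumes that *w*(*x*) = exp(−*x*/*RT*), which is consistent with equation 3 but is not defined in this section; the boundary value *W*^{5′}_{0} = 0, the value of `RT`, and the names `free_energy_sum` and `w5_recursion` are likewise assumptions made only for this example.

```python
import math

RT = 0.616  # kcal/mol, roughly 37 degrees C; illustrative value (assumption)

def free_energy_sum(energies, rt=RT):
    """Return x such that w(x) = sum of w(e), assuming w(e) = exp(-e/rt).

    Shifting by the lowest energy keeps every exponent <= 0, so exp cannot
    overflow. math.inf marks a forbidden term, whose weight is zero.
    """
    finite = [e for e in energies if e != math.inf]
    if not finite:
        return math.inf
    lowest = min(finite)
    total = sum(math.exp(-(e - lowest) / rt) for e in finite)
    return lowest - rt * math.log(total)

def w5_recursion(wq, n, rt=RT):
    """Evaluate equation 20 for i = 1..n (1-based indices).

    wq maps (j, i) with j < i to W^Q_{j,i}; missing keys are treated as
    forbidden. The boundary w5[0] = 0 is an assumption of this sketch.
    """
    w5 = [0.0] * (n + 1)
    for i in range(1, n + 1):
        terms = [w5[i - 1]]                      # w(W5'_{i-1})
        for j in range(1, i):                    # sum over j < i
            terms.append(w5[j - 1] + wq.get((j, i), math.inf))
        w5[i] = free_energy_sum(terms, rt)
    return w5
```

Equation 21 has the same shape with the loop running from the 3′ end toward the 5′ end, so a sketch of it would differ only in the index direction.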

Reorganizing the recursions in this way might appear to use more memory because of the additional arrays. In fact, the modified version requires less memory, because several of the arrays do not need to be stored in their entirety. Specifically, using the modified recursions, storage is only required for two diagonals of *W*, *W*^{L}, and *W*^{MBL}; for five diagonals of *W*^{MB}; and for a half-triangle of *W*^{Q}. Reducing memory usage is important because the size of the full arrays scales as *O*(*N*^{2}) and the available GPU memory on our hardware was limited to ∼2.5 GB. The modified recursions use four full *N*×*N* arrays and one half-triangle (about 4.5 full arrays in total), rather than the six full *N*×*N* arrays used in the original recursions, and therefore reduce memory usage by about 25%. In addition, the calculation of *W*^{5′} and *W*^{3′} is simplified (compare equations 20 and 21 above with equation 11 of reference [20]).
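The 25% figure follows from counting array elements, as the short Python calculation below shows. The element size of 8 bytes (double precision) and the sequence length `n = 5000` are assumptions chosen for illustration, and the diagonal-only buffers are ignored because they scale as *O*(*N*).

```python
def array_bytes(n, full_arrays, half_triangles, bytes_per_element=8):
    """Approximate storage for the O(N^2) arrays only.

    bytes_per_element=8 assumes double precision (an assumption, not a
    statement about the actual code).
    """
    full = full_arrays * n * n
    half = half_triangles * n * (n + 1) // 2   # upper triangle incl. diagonal
    return (full + half) * bytes_per_element

n = 5000  # illustrative sequence length
original = array_bytes(n, full_arrays=6, half_triangles=0)
modified = array_bytes(n, full_arrays=4, half_triangles=1)
saving = 1 - modified / original

print(f"original: {original / 1e9:.2f} GB")
print(f"modified: {modified / 1e9:.2f} GB")
print(f"saving:   {saving:.1%}")
```

Under these assumptions the two layouts need roughly 1.2 GB and 0.9 GB respectively, so the saving can decide whether a sequence of this length fits in ∼2.5 GB alongside other data.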

To determine how much additional computational overhead was imposed by the calculation of exp and log1p, we performed a comparison with an artificial reference calculation, which was identical except that calls to these functions were omitted. We found that for a 1,000-mer, the actual GPU calculation is only ∼20% more expensive than this reference calculation.

For a serial calculation on the CPU, there is a larger performance hit; the actual calculation is about a factor of two more expensive than the reference without exp or log1p. This is not the entire story, however: overall, the new optimized serial code, which uses logarithms, is still faster than the original code, which does not. Running the calculation in log space brings simplifications, such as not having to check for overflow and not having to multiply by scaling factors, which reduce computational expense.
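The role of exp and log1p in a log-space calculation can be seen in a minimal Python sketch of log-space addition. The function name `log_add` and the use of −∞ to represent a zero weight are assumptions of this example, not details taken from the code described here.

```python
import math

def log_add(a, b):
    """Return log(exp(a) + exp(b)) without forming exp(a) or exp(b).

    Only exp(lo - hi) is evaluated, and its argument is never positive,
    so the call cannot overflow. -inf stands for log(0).
    """
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    hi, lo = (a, b) if a >= b else (b, a)
    return hi + math.log1p(math.exp(lo - hi))

# exp(1000.0) overflows a double, but the log-space sum is well defined.
big = log_add(1000.0, 1000.0)
```

Each addition of two Boltzmann weights therefore costs one exp and one log1p, which is the overhead measured above, while removing the need for overflow checks and scaling factors.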