Efficient privacy-preserving variable-length substring match for genome sequence

The development of a privacy-preserving technology is important for accelerating genome data sharing. This study proposes an algorithm that securely searches a variable-length substring match between a query and a database sequence. Our concept hinges on a technique that efficiently applies FM-index to a secret-sharing scheme. More precisely, we developed an algorithm that achieves a secure table lookup in such a way that V[V[...V[p_0]...]] is computed for a given depth of recursion, where p_0 is an initial position and V is a vector. We used the secure table lookup for vectors created based on FM-index. The notable feature of the secure table lookup is that its time, communication, and round complexities do not depend on the table length N after the query input. Therefore, a substring match by reference to the FM-index-based table can also be conducted independently of the database length, and the entire search time is dramatically improved compared to previous approaches. We conducted an experiment using a human genome sequence of length 10 million as the database and a query of length 100, and found that the query response time of our protocol was at least three orders of magnitude faster than that of a non-indexed database search protocol under a realistic computation/network environment.


Introduction
The dramatic reduction in the cost of genome sequencing has prompted increased interest in personal genome sequencing over the last 15 years. Extensive collections of personal genome sequences have been accumulated both in academic and industrial organizations, and there is now a global demand for sharing the data to accelerate scientific research [1,2]. As discussed in previous studies, disclosing personal genome information has a high privacy risk [3], so it is crucial to ensure that individuals' privacy is protected upon data sharing. At present, the most popular approach for this is to formulate and enforce a privacy policy, but it is a time-consuming process to reach an agreement, especially among stakeholders with different legal backgrounds, which slows down the pace of research. Therefore, there is a strong demand for privacy-preserving technologies that can potentially compensate for or even replace the traditional policy-based approach [4,5]. One important application that needs a privacy-preserving technology is private genome sequence search, where different stakeholders respectively hold a query sequence and a database sequence, and the goal is to let the query holder know the result while simultaneously keeping the query and the database private. Many studies have addressed the problem of how to compute exact or approximate edit distance or the longest common substring (LCS) through techniques based on homomorphic encryption [6][7][8] and secure multi-party computation (MPC) [9][10][11][12][13][14][15], or how to compute sequence similarity based on private set intersection [16]. While these studies can evaluate global sequence similarity for two sequences of similar length, other studies address the problem of finding a substring between a query and a long genome sequence or a set of long genome sequences, with the aim of evaluating local sequence similarity [17][18][19][20][21][22][23]. Shimizu et al. 
proposed an approach to combine an additive homomorphic encryption and index structures such as FM-index [24] and the positional Burrows-Wheeler transform [25] to find the longest prefix of a query that matches a database (LPM) and a set-maximal match for a collection of haplotypes [17]. Sudo et al. used a similar approach and improved the time and communication complexities for LPM on a protein sequence by using a wavelet matrix [19]. Ishimaki et al. improved the round complexity of a set-maximal match, though the search time was more than one order of magnitude slower than [17] due to the heavy computational cost caused by the fully homomorphic encryption [18]. Sotiraki et al. used the Goldreich-Micali-Wigderson protocol to build a suffix tree for a set-maximal match [20]. According to experiments by [21], the search time of [20] is one order of magnitude slower than [17,21]. Mahdi et al. [21] used a garbled circuit to build a suffix tree for substring match and a set-maximal match under a different security assumption such that the tree-traversal pattern is leaked to the cloud server. Chen et al. [22] and Popic et al. [23] found fixed-length substring matches using a one-way hash function or homomorphic encryption on a public cloud under a security assumption such that the database is a public sequence and a query is leaked to a private cloud server.
In this study, we aim to improve privacy-preserving substring match under the security assumption such that both the query and the database sequence are strictly protected. We first propose a more efficient method for finding LPM, and then extend it to find the longest maximal exact match (LMEM), which is more practically important in bioinformatics. We designed the protocol for LMEM for ease of explanation, and the protocol can be applied to similar problems such as finding all maximal exact matches (MEMs) with a small modification. To our knowledge, this is the first study to address the problem of securely finding MEMs.

Our contribution
The time complexities of the previous studies [17,19] include the factor of N, and thus they do not scale well to a large database. For a similar reason, using secure matching protocols (e.g., [26]) for the shares (or tags in searchable encryption) of all substrings in a query and database is even worse in terms of time complexity. To achieve a real-time search on an actual genome database, we propose novel secret-sharing-based protocols that do not include the factor of N in the time, communication, and round complexities of the search time (i.e., the time from the input of a query until the end of the search).
The basic idea of the protocols is to represent the database string by a compressed index [24,27] and store the index as a lookup table. LPM and MEMs are found by at most ℓ and 2ℓ table lookups, respectively, where ℓ is the length of the query. More specifically, the table V is referenced in a recursive manner; i.e., one needs to obtain V[j], where j = V[i], given i. To ensure security, we need to compute V[j] without seeing any element of V. The key technical contribution of this study is an efficient protocol that achieves this type of recursive reference. We named the protocol secret-shared recursive oblivious transfer (ss-ROT). While the previous studies require O(N) time complexity to ensure security, the time, communication, and round complexities of ss-ROT are all O(ℓ) for ℓ recursive table lookups, except for the preparation of the table and generation of shares before the query input. Since the entire protocols mainly consist of ℓ table lookups for LPM, and 2ℓ table lookups and 2ℓ inner product computations for LMEM, the search times for LPM and LMEM do not depend on the database size. In addition to the protocols based on ss-ROT, we developed a protocol to reduce the data transfer size in the initial step by using an approach similar to that of ss-ROT. The protocol offers a reasonable trade-off between the amount of reduction in data transfer in the initial step and the increase in computational cost in the later step.
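As a plain-value illustration (not part of the secure protocol), the recursive reference V[V[...V[p_0]...]] can be sketched as follows; the vector V and the parameters are toy values of our choosing:

```python
# A non-secure illustration of the recursive table lookup that ss-ROT
# performs obliviously: starting from p0, follow V for ell steps.

def recursive_lookup(V, p0, ell):
    """Return V[V[...V[p0]...]] with ell nested references."""
    p = p0
    for _ in range(ell):
        p = V[p]  # in the secure protocol, neither party may see V or p
    return p

V = [2, 0, 3, 1]                        # toy lookup table
assert recursive_lookup(V, 2, 1) == 3   # V[2]
assert recursive_lookup(V, 2, 4) == 2   # V[V[V[V[2]]]]
```

The challenge addressed by ss-ROT is to run exactly this loop while every intermediate position and every element of V stays hidden from the computing nodes.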
We implemented the proposed protocol and tested it on substrings of a human genome sequence 10^3 to 10^7 in length and confirmed that the actual CPU time and data transfer overhead were in good agreement with the theoretical complexities. We also found that the search time of our protocol was three orders of magnitude faster than that of the previous method [17,19]. For conducting further performance analysis, we designed and implemented baseline protocols using major techniques of secret-sharing-based protocols. The results showed that the search times of our protocols were at least two orders of magnitude faster than those of the baseline protocols.

Secret sharing and secure computation
In t-out-of-n secret sharing (e.g., [28]), we split the secret value x into n pieces and can reconstruct x by combining t or more pieces. We call the split pieces "shares". The basic security notion for secret sharing is that we cannot obtain any information about x even if we gather at most (t − 1) shares. In this paper, we consider the case (t, n) = (2, 2). A 2-out-of-2 secret sharing ((2, 2)-SS) scheme over Z_{2^n} consists of two algorithms, Share and Reconst. For arithmetic and Boolean shares, we consider power-of-two integers n (e.g., n = 16) and n = 1, respectively.
Depending on the secret sharing scheme, we can compute arithmetic/Boolean gates over shares; that is, we can execute some kind of processing related to x without x. This means it is possible to perform some computation without violating the privacy of the secret data, and is called secure (multi-party) computation. It is known that we can execute arbitrary computation by combining basic arithmetic/Boolean gates. In the following paragraphs, we show how to concretely compute these gates over shares.
Semi-honest secure two-party computation based on (2, 2)-additive SS
We use a standard (2, 2)-additive SS scheme. For multiplication over shares (MULT), we use Beaver triples (BTs) [29]. Such a triple consists of bt_0 = (a_0, b_0, c_0) and bt_1 = (a_1, b_1, c_1) such that (a_0 + a_1)(b_0 + b_1) = (c_0 + c_1). Hereafter, a, b, and c denote a_0 + a_1, b_0 + b_1, and c_0 + c_1, respectively. We use these BTs as auxiliary inputs for computing MULT. Note that we can compute them in advance (i.e., in the offline phase) since they are independent of the inputs [[x]] and [[y]]. We adopt a trusted initializer setting (e.g., [30,31]); that is, BTs are generated by a party other than the two computing nodes and then distributed. In the online phase of MULT, each i-th party locally masks its shares with the triple, the parties reconstruct the masked values, and each party locally combines them to obtain a share of the product. In this paper, we consider the standard simulation-based security notion in the presence of semi-honest adversaries (for 2PC), as in [32]. We show the definition in Appendix 2. Roughly speaking, this security notion guarantees the privacy of the secret under the condition that the computing nodes do not deviate from the protocol; that is, although the computing nodes may execute arbitrary attacks locally, they do not (maliciously) manipulate data transmitted to other parties. The building blocks we adopt in this paper satisfy this security notion. Moreover, as described in [32], the composition theorem for the semi-honest model holds; that is, any protocol is privately computed as long as its subroutines are privately computed.
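As a sketch of the building blocks above, the following simulates (2, 2)-additive sharing and the Beaver-triple MULT in a single process; the ring size and function names are illustrative, and in the real setting the triple comes from the trusted initializer rather than being generated in place:

```python
import secrets

MOD = 2**16  # ring Z_{2^n} with n = 16, as used for arithmetic shares

def share(x):
    """Split x into two additive shares over Z_{2^n}."""
    x0 = secrets.randbelow(MOD)
    return x0, (x - x0) % MOD

def reconst(x0, x1):
    return (x0 + x1) % MOD

def beaver_triple():
    """Trusted-initializer step: shares of (a, b, c) with c = a*b."""
    a, b = secrets.randbelow(MOD), secrets.randbelow(MOD)
    c = (a * b) % MOD
    return share(a), share(b), share(c)

def mult(x_sh, y_sh):
    """Online phase of MULT: mask x and y with a and b, reconstruct
    d = x - a and e = y - b, then combine locally."""
    (a0, a1), (b0, b1), (c0, c1) = beaver_triple()
    d = reconst((x_sh[0] - a0) % MOD, (x_sh[1] - a1) % MOD)
    e = reconst((y_sh[0] - b0) % MOD, (y_sh[1] - b1) % MOD)
    z0 = (c0 + d * b0 + e * a0 + d * e) % MOD  # party 0 adds d*e once
    z1 = (c1 + d * b1 + e * a1) % MOD
    return z0, z1

assert reconst(*mult(share(123), share(45))) == (123 * 45) % MOD
```

Reconstructing d and e leaks nothing because they are masked by the uniformly random a and b, which is the same masking argument used throughout the paper's security proofs.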

Index structure for string search
Notation and definition Σ denotes a set of ordered symbols. A string consists of symbols in Σ. We denote a lexicographical order of two strings S and S′ by S ≤ S′ (i.e., A < C < G < T and AAA < AAC). We denote the i-th letter of a string S by S[i] and the substring from the i-th letter to the j-th letter by S[i, j]. The index starts with 0. The length of S is denoted by |S|. The reverse string of S (i.e., S[|S| − 1], . . . , S[0]) is denoted by Ŝ. We consider the direction from the i-th position to the j-th position as rightward if i < j and leftward otherwise.
Given a query w of length ℓ and a database S of length N, we define the longest prefix that matches a database string (LPM) by max_{(0,j)} {j | w[0, . . . , j] = S[k, . . . , l]}, where 0 ≤ j < ℓ and 0 ≤ k ≤ l < N. The longest maximal exact match (LMEM) is the longest substring of w that appears in S.
FM-Index and related data structures FM-Index [24] and related data structures [27] are widely used for genome sequence search. Given a query string w of length ℓ and a database string S of length N, FM-Index enables LPM to be found in O(ℓ) time regardless of N, and it also enables LMEM to be found in O(ℓ) time if auxiliary data structures are used [27]. Given all the suffixes of a string S, sorted in lexicographical order, a backward search narrows an interval over the sorted suffixes letter by letter, starting from the last letter of the pattern.

A naive approach finds LMEM in O(ℓ^2) time. We can improve it to O(ℓ) time by using the longest common prefix (LCP) array and related data structures [27]. The LCP array, denoted by LCP, stores the length of the longest common prefix between each suffix and the preceding suffix in lexicographical order. By finding a parent interval using PSV and NSV whenever it fails to extend the match, we can avoid useless backward searches, and thus LMEM is found in at most 2ℓ backward searches. LCP, PSV, and NSV are precomputed and stored in an efficient form that can be searched in constant time, so we can find LMEM in O(ℓ) time. See section 5.2 of [27] for more details of the data structures. Examples of the search by FM-Index, LCP, PSV, and NSV are provided in Appendix 1.
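For intuition, a plain (non-secure) FM-Index backward search over a toy database can be sketched as follows; the helper names and the naive occ computation are our simplifications for illustration, not the paper's implementation:

```python
# Plain backward search with FM-Index; the secure protocols perform the
# same interval updates, but over secret-shared lookup tables.

def build_fm_index(s):
    s += '$'                                    # sentinel, smallest symbol
    sa = sorted(range(len(s)), key=lambda i: s[i:])  # suffix array
    bwt = ''.join(s[i - 1] for i in sa)         # Burrows-Wheeler transform
    # C[c]: number of characters in s lexicographically smaller than c
    C = {c: sum(ch < c for ch in s) for c in sorted(set(s))}
    return bwt, C

def occ(bwt, c, i):
    """Number of occurrences of c in bwt[0:i] (naive; real indexes use
    rank structures answering this in constant time)."""
    return bwt[:i].count(c)

def backward_search(bwt, C, w):
    """Return the suffix-array interval [f, g) of w, or None if absent."""
    f, g = 0, len(bwt)
    for c in reversed(w):                       # extend the match leftward
        f = C[c] + occ(bwt, c, f)
        g = C[c] + occ(bwt, c, g)
        if f >= g:
            return None
    return f, g

bwt, C = build_fm_index('ACGTACGT')
assert backward_search(bwt, C, 'ACG') is not None
assert backward_search(bwt, C, 'TT') is None
```

Each step updates the interval [f, g) by one table reference per bound, which is exactly the access pattern the secure LPM protocol reproduces obliviously.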

Problem setting and outline of our protocols
We assume that a query holder A, a database holder B, and two computing nodes P_0 and P_1 participate in the protocol. A holds a query string w of length ℓ and B holds a database string T of length N. After the protocol is run, only A knows the LPM or LMEM between w and T. P_0 and P_1 do not obtain any information about w and T, except for ℓ and N.
Our protocol consists of offline, DB preparation, and Search phases. In the offline phase, B generates BTs (correlated randomness used for multiplication) and sends them to P 0 and P 1 . In the DB preparation phase, B creates a lookup table and distributes its shares to P 0 and P 1 . In the Search phase, A generates shares of the query and sends them to P 0 and P 1 , and P 0 and P 1 jointly compute the result without obtaining any information of the lookup table. Finally, A obtains the results. Figure 2 shows the schematic view of our goal and model. Note that the offline and DB preparation phases do not depend on a query string, so they can be computed in advance for multiple queries.
In section "Secret-shared recursive oblivious transfer", we propose the important building block ss-ROT that enables recursive reference to a lookup table. In section "Secure LPM", we describe how to design the lookup table based on FM-Index, and propose an efficient protocol for LPM by using the lookup table and ss-ROT. In section "Secure LMEM", we describe the additional table design for auxiliary data structures, and propose the complete protocol for LMEM. Table 2 summarizes the theoretical complexities of the three protocols. For comparison, the complexities of the baseline protocols and a previous method for LPM based on additive homomorphic encryption [17,19] are shown. As we mentioned in section "Introduction", the baseline protocols are designed using major techniques of secret-sharing-based protocols. The detailed algorithms are described in Appendix 3.

Secret-shared recursive oblivious transfer
We define a problem called secret-shared recursive oblivious transfer (ss-ROT) as follows: given a vector V of length N and a public initial position p_0, compute shares of the ℓ-times recursive reference V[V[. . . V[p_0] . . .]] without revealing any element of V. For simplicity, we denote the recursion of Eq. 3 by V^{(ℓ)}[p_0]. In our protocol, all the random values are uniformly generated from Z_{2^n}.
Search phase The Search phase consists of two steps and is described in Lines 2-5 of Protocol 1. The input is the initial position p_0 and shares of R. The output is [[V^{(ℓ)}[p_0]]]. An example of a search is illustrated in Fig. 3.
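The Search phase can be simulated in plain values on the Fig. 3 example; the formula for the randomized vectors R_j follows our reading of Eq. 4 (positions pre-shifted by r_{j−1} and entries masked by r_j), and the sharing between P_0 and P_1 is omitted for brevity:

```python
import secrets

# ss-ROT Search phase on the Fig. 3 toy example:
# V = (2, 0, 3, 1), p0 = 2, ell = 4, so V^(4)[2] = 2.

N, ell = 4, 4
V = [2, 0, 3, 1]
p0 = 2
r = [secrets.randbelow(N) for _ in range(ell - 1)]  # B's random shifts

# DB preparation: B builds randomized vectors R_j. Each entry hides V
# behind r[j], and positions are pre-shifted by r[j-1] so that the
# previous round's randomized position indexes the right entry.
R = []
for j in range(ell):
    prev = r[j - 1] if j > 0 else 0
    mask = r[j] if j < ell - 1 else 0  # final round leaves no extra mask
    R.append([(V[(i - prev) % N] + mask) % N for i in range(N)])

# Search: only randomized positions are ever reconstructed.
p = p0
for j in range(ell - 1):
    p = R[j][p]           # p_{j+1} = V^{(j+1)}[p0] + r_j (mod N)
result = R[ell - 1][p]    # in the real protocol this stays secret-shared

assert result == 2        # V[V[V[V[2]]]] = 2, for any choice of r
```

Because each reconstructed position is masked by a fresh uniform r_j, the computing nodes learn nothing about V, matching the security intuition below.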

Security intuition
In the DB preparation phase of ss-ROT, B does not disclose any private values, and P_0 and P_1 receive only shares. In the Search phase, all the messages exchanged between P_0 and P_1 are shares except for the result of Reconst in Step 1. In the j-th step of the loop in Step 1, the reconstructed value is randomized by r_j, so no information is leaked. Note that for each vector R_j, all the elements R_j[0], . . . , R_j[N − 1] are randomized by the same value r_j, but only one of them is reconstructed, and different random numbers r_0, . . . , r_{ℓ−1} are used for R_0, . . . , R_{ℓ−1}. In Step 2, P_0 and P_1 output a result, and no information other than the result is leaked.

Theorem 1 ss-ROT is correct and secure in the semi-honest model.
Proof Correctness and security of the ss-ROT protocol are proved as follows.
Correctness. We assume the following equation.
In Step 1, for j = 0, the protocol computes p_1 by reconstructing R_0[p_0]. For j = k, the protocol computes p_{k+1} by reconstructing R_k[p_k]. From the definition of R_j[i] in Eq. 4 and the assumption of Eq. 5, Eq. 5 holds for i = 1 by Eq. 6. It also holds for i = k + 1 under the assumption that Eq. 5 holds for i = k. Therefore, by induction, Eq. 5 holds for i = 1, . . . , ℓ − 1.
In Step 2, P_0 and P_1 obtain shares of V^{(ℓ)}[p_0]. The above argument completes the proof of correctness of Theorem 1.
Security. Since the roles of P_0 and P_1 are symmetric, it is sufficient to consider the case where P_0 is corrupted. The input to P_0 is p_0 and ℓ, and the output of P_0 is V^{(ℓ)}[p_0]. The function achieved by Protocol 1 is deterministic and the protocol is correct. Therefore, to ensure the security of Protocol 1, we need to prove the existence of a probabilistic polynomial-time simulator S whose output is identically distributed to X, where X is P_0's view. All the messages from B and P_1 are uniformly at random in Z_{2^n}, as they are generated by Share. The reconstructed values are uniformly at random in Z_N from the definition of Eq. 4. Let us denote a random number u chosen from a set U uniformly at random by u ∈_R U. We construct S as described in Protocol 2. The output of S is R_0 ∈ Z_{2^n}^{ℓ×N}, R_1 ∈ Z_{2^n}^{ℓ}, and p_1, . . . , p_{ℓ−1}. In Lines 6 and 9, p_1, . . . , p_{ℓ−1} are generated uniformly at random in Z_N. In Line 7, the generated shares (j = 1, . . . , ℓ − 1) are uniformly at random in Z_{2^n} by Line 3. Therefore, Eq. 8 holds. By the above discussion, our ss-ROT satisfies security in the semi-honest model.

Fig. 3 Example of a search when V = (2, 0, 3, 1), p_0 = 2, and ℓ = 4. The goal is to compute [[V^{(4)}[2]]] = [[2]]. Here we assume B generates r_0 = 1, r_1 = 2, r_2 = 1. In Step 1 of the Search phase, P_0 and P_1 jointly reconstruct each randomized position (the reconstructed value is randomized by r_0, so no element of V is leaked). In a similar way, P_0 and P_1 compute the remaining positions.

Secure LPM
The goal of Secure LPM is to find LPM by a secure backward search; i.e., starting with the initial interval [f_0 = 0, g_0 = N), we can compute the match by recursively referring to the lookup table while f < g.

Protocol overview
The key idea of Secure LPM is to refer to V by ss-ROT; i.e., P_0 and P_1 jointly refer to V ℓ times in a recursive manner. To achieve a backward search, P_0 and P_1 need to select V_x[·] for each reference, where x is the query letter to be searched with. This is achieved by expressing the query letter by a unary code (Eq. 11) and computing the inner product of Eq. 11 and the table entries for each letter. To find LPM, P_0 and P_1 need to check f = g for each reference. We use the subprotocol Equality to check it securely. Since V is randomized with different numbers for searching f and g, the difference of the random numbers is precomputed and removed securely upon the equality check. In the DB preparation phase, B distributes shares of the randomized tables and r′[j] to P_0 and P_1.

Search phase Protocol 3 describes the algorithm in detail.
A generates four vectors q_A, q_C, q_G, q_T, each of length ℓ, such that q_c[i] = 1 if w[i] = c and q_c[i] = 0 otherwise (Eq. 11). Figure 4 illustrates an example of the table lookup.
A generates shares of q_A, q_C, q_G, q_T and distributes them to P_0 and P_1.
The reconstructed value becomes the lower bound f (for p_0 = 0) or the upper bound g (for p_0 = N) of the interval that corresponds to the prefix match of length k; the other bound is computed in the same manner. Each vector R_{c,f}^j in Eq. 10 is generated in the same manner as R_j in Eq. 4. Since Eq. 10 uses the common random values r_f^j and r_g^j, the recursion by Lines 5 and 7 can compute f, and the recursion by Lines 6 and 8 can also compute g, which is obvious from the correctness of ss-ROT. The longest match is found when the interval width becomes 0.
The precomputed difference r′ of the random numbers is removed to obtain the correct interval width d. When the width is 0, d becomes one of 0, N′, and −N′. Therefore, Line 12 computes the equality of d with 0, N′, and −N′, respectively. By reconstructing all the results in Lines 16-18, A knows the round in which the interval width becomes 0; i.e., he/she knows LPM. The above argument completes the proof of correctness of Theorem 2.
Security We only show a sketch of the proof. For Lines 1-2 of Protocol 3, A and B do not disclose any private values, and P_0 and P_1 receive only shares. For Lines 3-14, it is guaranteed by the subprotocols ADD, MULT, and Equality that all the messages exchanged between P_0 and P_1 are shares, except for the output of Reconst. Since the reconstructed values are of the form (V[f_j] + r_f^j) mod N′ and (V[g_j] + r_g^j) mod N′ according to Eq. 10, it is obvious that V is randomized for all rounds j = 0, . . . , ℓ − 1, and no information is leaked. For Lines 14-17, only the output of Equality at Line 11 is reconstructed. The reconstructed values are either 1 or 0 according to Equality, and no information other than the result is leaked.
A may reveal T by making many queries. Such a problem is called output privacy. Although output privacy is outside the scope of this paper, we should mention here that A needs to make an unrealistically large number of queries to obtain T by such a brute-force attack, considering that N is very large.

Complexities
The DB preparation phase generates shares of R_{c,f}^j and R_{c,g}^j (c ∈ Σ, 0 ≤ j < ℓ).

Secure LMEM
Construction of lookup table As described in section "Index structure for string search", we can find a parent interval by a reference to LCP, PSV, and NSV. Therefore, in addition to V_c defined in section "Secure LPM", we prepare lookup tables that simply store all of their outputs; i.e., V_lcp, V_psv, and V_nsv.

DB preparation phase B generates randomized vectors R_{c,f}, R_{c,g}, and r′[j] = (r_f^j − r_g^j) mod N′ using the same algorithm as in section "Secure LPM" for length 2ℓ. As shown in Eq. 2, V_lcp is referred to by the upper and lower bounds of [f, g). Therefore, B generates circular permutations of V_lcp, W_{l,x} (where x is either f or g), such that W_{l,f} and R_{c,f}, and W_{l,g} and R_{c,g}, are permuted by the same random values, respectively. V_psv is referred to by both f and g, and its output is plugged into f. Therefore, B generates W_{p,f}^j[i] and W_{p,g}^j[i] analogously. Similarly, V_nsv is referred to by both f and g, and its output is plugged into g. Therefore, B generates W_{n,f}^j[i] and W_{n,g}^j[i] as follows.
B distributes shares of R c,f , R c,g , r ′ , W l,f , W l,g , W p,f , W p,g , W n,f , and W n,g to P 0 and P 1 .
Search phase Protocol 4 describes the algorithm in detail. A generates query vectors q_A, q_C, q_G, q_T by Eq. 11 and distributes shares of the vectors to P_0 and P_1. In Line 6 of Protocol 4, [f̂, ĝ) is computed by the reference to R (i.e., a search based on a backward search), similarly to Lines 5-6 of Protocol 3. In Line 11, [f_ex, g_ex) is computed by the reference to W (i.e., a search based on LCP, PSV, and NSV). In Line 13, the interval is updated to either [f̂, ĝ) or [f_ex, g_ex) based on the result of f′ = g′ in Lines 7-9, where [f′, g′) is the interval that corresponds to a substring match.
In each round, we need to know the query letter to be searched with, so we need to maintain the right end position of the match in the query. The position moves toward the right while the match is extended, but remains the same when the interval is updated based on PSV and NSV. To memorize the position, we prepare shares of a unit bit vector u of length ℓ, in which the position t is memorized as u[t] = 1 and u[i ≠ t] = 0. In Lines 20-23, u remains the same if the interval is updated based on PSV and NSV, and u is shifted rightward by one position otherwise. When the search is finished (e.g., the right end of a match exceeds the right end of the query), u = (0, . . . , 0). Therefore, in Lines 25-28, x = 1 while the right end of a match does not exceed the right end of the query, and x = 0 after finishing the search. In Lines 29-31, the inner product of q_c (c ∈ Σ) and u becomes the encoding of w[t] that is used for the next round.
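The behavior of u can be sketched in plain values as follows; the shift-register reading of Lines 20-23 is our interpretation, and the protocol performs the shift and the flag computation obliviously over shares:

```python
# One-hot position vector u for Secure LMEM: u marks the right end t of
# the current match. It is shifted rightward when the match is extended
# and kept when the interval moves to a parent interval. Once t passes
# the query end, u becomes all-zero and the flag x = sum(u) turns 0.

ell = 4
u = [1, 0, 0, 0]              # match currently ends at position t = 0

def shift_right(u):
    """Move the marked position one step rightward; the 1 falls off the
    end when the match end passes the last query position."""
    return [0] + u[:-1]

u = shift_right(u)            # match extended: t = 1
assert u == [0, 1, 0, 0]
assert sum(u) == 1            # x = 1: the search is still running

for _ in range(3):            # extend past the end of the query
    u = shift_right(u)
assert u == [0, 0, 0, 0]
assert sum(u) == 0            # x = 0: the search is finished
```

Keeping the position as a one-hot vector is what makes the letter selection of Lines 29-31 a simple inner product instead of a secret array access.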
We also maintain the left end position of the match. While the match is extended, the position remains the same, and it moves toward the right when the interval is updated by the parent interval. Correctness. The table referred to in round j + 1 is V_x; i.e., one of V_c, V_psv, and V_nsv. Note that V_x could be a different table for each j + 1, but we abuse the same notation for simplicity. Since f_j and g_j are described in the form of V^{(j)} based on Eq. 5, Eq. 12 and Eq. 13 are transformed into the form of Eq. 5. Security. We only show a sketch of the proof. For Lines 1-2 of Protocol 4, A and B do not disclose any private values, and P_0 and P_1 receive only shares. For Lines 3-37, it is guaranteed by the subprotocols ADD, MULT, Equality, and Choose that all the messages exchanged between P_0 and P_1 are shares, except for the output of Reconst in Line 14 (see section "Secure computation based on secret sharing" for details of the subprotocols). In Line 14, the reconstructed values are of the form V^{(j+1)}[p_0] + r_f^j and V^{(j+1)}[p_0] + r_g^j, according to Eq. 5, Eq. 12, and Eq. 13. Since f_{j+1} and g_{j+1} are randomized by r_f^j and r_g^j, respectively, for all rounds j = 0, . . . , 2ℓ − 1, no information is leaked. In Line 38, A reconstructs only the search result (the length and start position of LMEM).

Complexities
The DB preparation phase generates shares of R_{c,f}^j and R_{c,g}^j (c ∈ Σ, 0 ≤ j < 2ℓ) and W_{x,f}^j and W_{x,g}^j (x ∈ {l, p, n}, 0 ≤ j < 2ℓ).

Reducing size of shares in DB preparation phase
The protocols based on ss-ROT are quite efficient in the Search phase; however, they require a large data transfer from B to the computing nodes in the DB preparation phase when the number of queries and the length of the database are large. To mitigate the problem, we propose another protocol that can reduce the size of the shares in the DB preparation phase. We use two parameters m and n (m < n) for computing shares: the shares of R are generated over Z_{2^m} rather than Z_{2^n}, and are cast up to Z_{2^n} during the search.

DB preparation phase B computes the following vectors and distributes their shares to P_0 and P_1.

Proof The following equation is equivalent to Eq. 14.

Search phase A generates query vectors for a query string w by Eq. 11, generates shares of q, and distributes them to P_0 and P_1. The entire protocol is described in Protocol 5.

Security We only show a sketch of the proof. All the messages exchanged between P_0 and P_1 are shares except for Line 6. In Line 6, the reconstructed value p_j is randomized by r_f[j] in Line 5. Therefore, no information is leaked.

Complexities
In the DB preparation phase, shares of R are generated over Z_{2^m}, which reduces the data transfer size compared to Protocol 3.

Experiment
We implemented Protocol 3 (Secure LPM), Protocol 4 (Secure LMEM), and Protocol 5. For comparison, we also implemented baseline protocols (Baseline LPM and Baseline LMEM). Details of the baseline protocols are provided in Appendix 3. All protocols were implemented in Python 3.5.2. The dataset was created from Chromosome 1 of the human genome. We extracted substrings of length N = 10^3, 10^4, 10^5, 10^6, and 10^7 for databases, and ℓ = 10, 25, 50, 75, and 100 for queries. Share was run with n = 16 and n = 32 for N < 10^5 and 10^5 ≤ N, respectively, in the proposed protocols, and with n = 1 for a Boolean share and n = 8 for an arithmetic share in the baseline protocols. We did not implement a data transfer module, and each protocol is implemented as a single program. Therefore, the search time of the protocols was measured by the time consumed by either one of P_0 and P_1. To assess the influence of communication in a realistic environment, we theoretically estimated delays caused by network bandwidth and latency. We assume three environments: LAN (0.2 ms/10 Gbps), WAN 1 (10 ms/100 Mbps), and WAN 2 (50 ms/10 Mbps). During the run of the Search phase, we stored all the data transferred from P_0 to P_1 in a file and measured the file size as the actual communication size. Note that the communication is symmetric, and the data transfer size from P_0 to P_1 is equal to that from P_1 to P_0. Based on the data transfer size D bytes, we estimate the communication delay by D/k + eT/1000, where k is the bandwidth, e is the latency, and T is the number of communication rounds. All the protocols were run with a single thread on the same machine equipped with an Intel Xeon 2.2 GHz CPU and 256 GB memory. We also tested the C++ implementation of [19], which is based on additive homomorphic encryption (AHE).
The algorithm for LPM in [17] for strings with |Σ| ≤ 4 (e.g., genome sequences) is the same as that of [19]. Sudo et al. [19] is implemented as server-client software, and the client and the server were run with individual single threads on the same machine. Therefore, the results of [19] do not include delays caused by bandwidth limitation and latency, so we also estimated delays based on the data transfer size and the number of communication rounds in the same manner. Each run of the program was terminated if the total runtime of all phases exceeded 20 h. Table 3 and the accompanying graph show the results for ℓ = 10, . . . , 100 and N = 10^6. We cannot show the results of Baseline LMEM because none of its runs finished within the time limit. As shown in the graph, the time of Secure LPM increases linearly in ℓ and that of Baseline LPM increases proportionally to ℓ^2, which is in good agreement with the theoretical complexities in Table 2. According to the graph, the time of Secure LMEM also increases linearly in ℓ, though its time and communication complexities are O(ℓ^2). This is because the CPU times are much smaller than the delays caused by network latency, which are influenced by the round complexity O(ℓ).

Comparison to baseline protocols
We have preliminary results from testing Secure LPM and Baseline LPM on an actual network (10 ms/100 Mbps). The results were 40 s for Secure LPM and 1739 s for Baseline LPM when N = 10^6. Though both preliminary implementations leave room for improvement in data transfer performance, the results also indicate that our protocol outperforms the baseline protocol and the previous study.
The time and size of Secure LPM and Secure LMEM are several orders of magnitude better than those of the baseline protocols for the offline phase, whereas the baseline protocols are better for the DB preparation phase. The total time of the offline and DB preparation phases of our protocols is more than one order of magnitude faster than that of the baseline protocols. For the total size of the offline and DB preparation phases, Secure LMEM was better than Baseline LMEM, but Baseline LPM was better than Secure LPM even though the complexity is better for Secure LPM. This is because the majority of the shares were Boolean in the baseline protocols, while all of the shares were arithmetic in the proposed protocols.

Comparison to [19]
[19] is a two-party MPC based on AHE; each homomorphic operation is time-consuming, and it has no offline or DB preparation phases. As shown in Table 3, the Search time of Secure LPM is four orders of magnitude faster than that of [19] for N = 10^6. Since the time complexity of [19] includes a factor of N, the difference in Search time becomes greater as N grows. Moreover, our protocols have a further advantage in communication for a query response when the network environment is poor: the round complexities of [19] and our protocols are the same, while [19] requires O(√N) communication size. The entire runtimes of our protocols, including all phases, are still six times faster for N = 10^5 and N = 10^6. We can compute LMEM by examining [19] for all the positions in a query string, but this approach consumed 3406 s and 2.6 GB of communication for N = 10^4.

Result of the approach in section "Reducing size of shares in DB preparation phase"
We also implemented Protocol 5 (Secure LPM2) to investigate the trade-off between the reduction of the share size in the DB preparation phase and the increase in search time and communication overhead in the Search phase. We used the same programming language (Python 3.5.2) and the same datasets. Share was run with n = 8 when generating the arithmetic shares of R. For the generation of the rest of the arithmetic shares, Share was run with n = 16 and n = 32 for N < 10^5 and 10^5 ≤ N, respectively (i.e., m = 8, n = 16 (N < 10^5), and n = 32 (10^5 ≤ N) in the notation of section "Reducing size of shares in DB preparation phase"). The results are shown in Table 3. The total size of the shares in the DB preparation phase was 7.7 GB for Protocol 5 and 30.5 GB for Protocol 3, which is in good agreement with the theoretical complexities discussed in section "Reducing size of shares in DB preparation phase". The search time of Protocol 5 is around 2 s longer than that of Protocol 3. We consider that the increase in search time is mainly caused by using the rather costly subprotocols CastUp, Comp, and MULT more times, which also increases the number of communication rounds. Despite the increase in search time, Protocol 5 is still more than two orders of magnitude faster than Baseline LPM and three orders of magnitude faster than [19], so we consider that Protocol 5 offers a reasonable trade-off between performance in the DB preparation phase and the Search phase.

Discussion
As clearly shown by the results, the Search times of the proposed protocols are highly efficient. Considering the importance of query response time for real applications, it is reasonable to reduce the Search time at the cost of DB preparation time. Since the total times for the offline and DB preparation phases of the proposed protocols were significantly better than those of the well-designed baseline protocols, we consider the trade-off between Search and DB preparation times in our approach to be efficient. For further reduction of DB preparation time, parallelizing the share generation is a feasible approach. Regarding the DB preparation phase, the data transfer between the server and the computing nodes is problematic when the number of queries and the length of the database are large. To mitigate the problem, we also proposed an approach that uses arithmetic shares of a shorter bit length, which offers a reasonable trade-off between the reduction of data size in the DB preparation phase and the increase in time and communication overhead in the Search phase. Another solution that potentially mitigates the problem is to use an AES-based random number generation similar to the technique used in [33]. To explain briefly, when the server needs to distribute a share of x, (1) the server and P_0 generate the same randomness r using a pre-shared key and a pseudorandom function, and (2) the server computes x − r and sends it to P_1. Although P_0's computation cost increases, we can remove the data transfer from the server to P_0. In our protocols, the generation of shares in the DB preparation phase cannot be outsourced because they depend on the database. Designing an efficient algorithm to outsource the share generation is an important open question.
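The PRF-based distribution just described might look as follows when simulated in a single process; SHA-256 stands in for AES as the pseudorandom function, and the key handling, counter scheme, and modulus are illustrative assumptions rather than the concrete construction of [33]:

```python
import hashlib
import secrets

# Sketch: the server and P0 expand a pre-shared key into the same
# pseudorandom share r, so only P1's share x - r must be transmitted.

MOD = 2**16

def prf(key, counter):
    """Pseudorandom value in Z_{2^16} derived from key and counter
    (SHA-256 used here as a stand-in for AES)."""
    digest = hashlib.sha256(key + counter.to_bytes(8, 'big')).digest()
    return int.from_bytes(digest[:2], 'big') % MOD

key = secrets.token_bytes(16)   # pre-shared between the server and P0

def distribute(x, counter):
    r = prf(key, counter)       # P0 derives this locally: no transfer
    return (x - r) % MOD        # the only message, sent to P1

x, counter = 12345, 0
share_p0 = prf(key, counter)            # computed by P0 on its own
share_p1 = distribute(x, counter)       # received by P1 from the server
assert (share_p0 + share_p1) % MOD == x
```

Halving the transmitted data this way trades server-to-P_0 bandwidth for local PRF evaluations on P_0's side, which matches the cost shift noted above.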

Appendix 1: Examples of a search with FM-Index and auxiliary data structures
Let us show examples of a search with FM-Index, the LCP array, PSV, and NSV. In addition to the data structures defined in section "Index structure for string search", we also define a string F that stores the first letters of the lexicographically sorted suffixes. Figure 7 illustrates the example of a backward search to find the longest suffix of the query (ATG) that matches the database, and Fig. 8 illustrates the search for MEMs with the query (CGC) by using the LCP array, PSV, and NSV. As shown in the upper center panel of Fig. 8, the search fails at the backward search with 'C' after finding the interval [7,8) that corresponds to GC. Since LCP[8] ≤ LCP[7], the parent lcp-interval becomes [PSV[7] = 5, NSV[7] = 8), which corresponds to 'G'. The match CG is then searched by the backward search with 'C' from the parent lcp-interval.