Identification of bifurcation transitions in biological regulatory networks using AnswerSet Programming
 Louis Fippo Fitime^{1, 3},
 Olivier Roux^{1},
 Carito Guziolowski^{1}Email author and
 Loïc Paulevé^{2}View ORCID ID profile
https://doi.org/10.1186/s1301501701103
© The Author(s) 2017
Received: 1 January 2017
Accepted: 7 July 2017
Published: 20 July 2017
Abstract
Background
Numerous cellular differentiation processes can be captured using discrete qualitative models of biological regulatory networks. These models describe the temporal evolution of the state of the network subject to different competing transitions, potentially leading the system to different attractors. This paper focusses on the formal identification of states and transitions that are crucial for preserving or preempting the reachability of a given behaviour.
Methods
In the context of nondeterministic automata networks, we propose a static identification of socalled bifurcations, i.e., transitions after which a given goal is no longer reachable. Such transitions are naturally good candidates for controlling the occurrence of the goal, notably by modulating their propensity. Our method combines AnswerSet Programming with static analysis of reachability properties to provide an underapproximation of all the existing bifurcations.
Results
We illustrate our discrete bifurcation analysis on several models of biological systems, for which we identify transitions which impact the reachability of given longterm behaviour. In particular, we apply our implementation on a regulatory network among hundreds of biological species, supporting the scalability of our approach.
Conclusions
Our method allows a formal and scalable identification of transitions which are responsible for the lost of capability to reach a given state. It can be applied to any asynchronous automata networks, which encompass Boolean and multivalued models. An implementation is provided as part of the Pint software, available at http://loicpauleve.name/pint.
Introduction
The emerging complexity of dynamics of biological networks, and in particular of signalling and gene regulatory networks, is mainly driven by the interactions between the species, and the numerous feedback circuits they generate [1–4]. One of the prominent and fascinating features of cells is their capability to differentiate: starting from a multipotent state (for instance, a stem cell), cellular processes progressively confine the cell dynamics in a narrow state space, an attractor. Deciphering those decision processes is a tremendous challenge, with important applications in cell reprogramming and regenerative medicine.
Qualitative discrete models of network dynamics, such as Boolean and multivalued networks [5, 6], have been designed with such an ambition. These frameworks model nodes of the network by variables with small discrete domains, typically Boolean. Their value changes over time according to the state of their parent nodes. Exploring the dynamical properties of those computational models, such as reachability, i.e., the ability to evolve to a particular state, or attractors, i.e., the longrun behaviours, allows understanding part of important cellular processes [7–9].
Besides extracting precise knowledge on differentiation mechanisms in the discrete dynamics of the network, bifurcation transitions can in fine suggest drug targets for controlling cellular differentiation and/or counteracting pathological behaviours. Indeed, if it is ensured that the bifurcation is triggered in the appropriate state, then the reachability of a state of interest would be certainly prevented. On the other hand, blocking all bifurcation transitions in the appropriate states would ensure that the state of interest is inevitably reached.
In this article, we formally introduce the notion of bifurcation transitions in discrete dynamics of automata networks (ANs) and we provide a scalable method for their identification that relies on declarative programming with AnswerSet Programming (ASP) [10]. ANs allow encoding exactly the dynamics of asynchronous Boolean and multivalued networks which are also known as Thomas networks [11]. We first show that bifurcation transitions can be completely identified using computationtree temporal logic (CTL). However, this characterization relies extensively on the reachability problem, which is PSPACEcomplete in ANs and similar frameworks [12], which limits its tractability. The main contribution of this paper is the introduction of an approximation of the bifurcation identification which is NP. In order to obtain an approach tractable on large biological networks, we show a combination of methods of static analysis of ANs dynamics [13, 14], concurrency theory, and constraint programming for relaxing efficiently the bifurcation problem. Our method identifies correct bifurcations only (no false positives) but, due to the embedded approximations, is incomplete (false negatives may exist). To our knowledge, this is the first integrated method to extract bifurcation transitions from discrete models of large interaction networks.
The output of our method is a set of transitions, for instance “activation of gene x by active genes y and z”, and optionally the set of states in which their occurrence removes the capability to reach the goal. It is worth noticing that bifurcation transitions are transitions of the input model which play a crucial role for the goal reachability. They do not directly provide targets for controlling the system. Therefore, bifurcation transitions are different from intervention sets [15, 16] or cut sets [17, 18] which propose perturbations to apply on a system in order to enforce/prevent the occurrence of a state/reaction of interest. Whereas these predictions can help to control the reachability of an attractor, they do not allow to directly understand the structure of the original model dynamics, notably how the different attraction basins are connected. Bifurcation transitions precisely indicate when and how the system exits a state where a capability was reachable.
Background
Automata networks
An AN is a finite set of finitestate machines that have transitions between their local states determined by the state of other automata in the network. The global state space of the network is the product of the local states of the individual automata. The local transitions specify the current and successor local state of an automaton, possibly constrained by the state of other automata.
Definition 1

\(\Sigma\) is the finite set of automata identifiers;

For each \(a\in \Sigma\), \(S(a) = \{a_i,\dots ,a_j\}\) is the finite set of local states of automaton a; \(S \mathop {=}\limits ^{\Delta }\prod _{a\in \Sigma } S(a)\) is the finite set of global states; \(L\mathop {=}\limits ^{\Delta }\bigcup _{a\in \Sigma } S(a)\) denotes the set of all the local states.

\(T = \{ a \mapsto T_a \mid a\in \Sigma \}\), where \(\forall a\in \Sigma , T_a \subseteq S(a)\times 2^{L\setminus S(a)} \times S(a)\) with \((a_i,\ell ,a_j)\in T_a \Rightarrow a_i\ne a_j\) and \(\forall b\in \Sigma , \ell \cap S(b) \le 1\), is the mapping from automata to their finite set of local transitions.
A local transition \(t= {a}_{i} \xrightarrow {\ell } {a}_{j}\in T\) is applicable in a global state \(s\in S\) when \(a_i\) and all the local states in \(\ell\) are in s. The application of the local transition, noted \(s\cdot t\), replaces the local state of a with \(a_j\) (Definition 2).
Definition 2
In this paper, we consider the asynchronous semantics of ANs: only one local transition can be applied at a time. In this asynchronous semantics, different local transitions may be applicable to the same state, each of them leading to different behaviours. The choice of the transition is nondeterministic. A global state \(s'\) is reachable from s, noted \(s\rightarrow ^{*}s'\), if and only if there exists a (possibly empty) sequence of transitions leading from s to \(s'\). Finally, an attractor is a smallest set of states from which no transition can exit. They correspond to the longterm dynamics of the network:
Definition 3

A is strongly connected w.r.t. \(\rightarrow ^{*}\): \(\forall s,s'\in A, s\rightarrow ^{*}s'\); and

A is terminal w.r.t. \(\rightarrow ^{*}\): \(\forall s\in A\), \(\exists s'\in S: s\rightarrow ^{*}s' \Rightarrow s'\in A\).
Encoding Boolean and Thomas networks with automata networks
The asynchronous semantics of any Boolean network or Thomas (multivalued) network can be encoded equivalently with ANs [11]. Note that, according to Thomas networks semantics, the transitions increment or decrement by one the level of node. Hence, ANs encoding Thomas networks have only transitions of the form \({a}_{i} \xrightarrow {\ell } {a}_{j}\) with \(ij=1\).
Tools such as BioLQM ^{1} provide automatic translations from standard model formats for Boolean/Thomas networks to ANs.
Reachability and formal approximations
In this section, we give a brief overview of the basics of reachability checking, stressing the methods we use in this paper.
State graph and partial order reductions
Given two states \(s,s'\) of an AN (or an equivalent Petri net), verifying \(s\rightarrow ^{*}s'\) is a PSPACEcomplete problem [12].
The common approach for reachability checking is to build the (finite) set of all the states reachable from s until finding \(s'\), by exploring all the possible transitions. However, such a set can be rapidly intractable with large models. Techniques relying on symbolic representations, notably using binary decision diagrams (BDDs) and variants [19] can improve the scalability of this approach by several orders of magnitude [20].
Experimental results for the identification of bifurcation transitions depending if \((\mathrm{I3})\) or \((\mathrm{I3}^\#)\) is used, compared to a exact model checking (MC) using NuSMV [20]
Automata network  States  Goal  MC (NuSMV)  With \((\mathrm{I3})\)  With \((\mathrm{I3}^\#)\)  

\(t_b\)  Time (s)  \(t_b\)  Time (s)  \(t_b\)  Time (s)  
Lambda phage \(\Sigma =4\quad T=11\)  14  \(\mathrm {CI}_2\)  10  0.1  6  0.1  0  0.2 
\(\mathrm {Cro}_2\)  3  0.1  3  0.1  2  0.3  
EGF/TNF \(\Sigma =28\quad T=55\)  3698  \(\mathrm {NFkB}_0\)  5  0.2  4  0.1  2  0.1 
\(\mathrm {IKB}_1\)  5  0.2  3  0.1  2  0.1  
Th_th1 \(\Sigma =101\quad T=381\)  ≈3.10^{11}  \(\mathrm {BCL6}_1\)  8  13  6  16  5  23 
\(\mathrm {TBET}_1\)  11  14  5  10  4  24  
Th_th2 \(\Sigma =101\quad T=381\)  ≈10^{12}  \(\mathrm {GATA3}_1\)  9  108  8  24  7  20 
\(\mathrm {BCL6}_1\)  7  570  5  25  4  25  
Th_pluri \(\Sigma =101\quad T=381\)  >5.10^{14}  BCL6_{1} IL21_{1} FOXP3_{1} TGFB_{1}  Outoftime  Outoftime  2  32  
0  26  
4  56  
5  96 
In this paper, one of our methods uses complete finite prefixes of unfoldings to compute the states that are reachable from a given initial state. Indeed, because biological networks are typically very large, but also very sparse (each node/automaton interacts with a few others, compared to the size of the network), they exhibit a high degree of concurrency for their transitions, making unfolding approaches very effective in practice.
Formal approximations
When facing a large AN, it may turn out that the reachable state space is too large for the aforementioned exact verification of reachability. Moreover, the complexity of the reachability problem can be prohibitive when numerous verifications have to be done, for instance when enumerating candidate initial states.
The core objects are the objectives and their local paths within two local states \(a_i\), \(a_j\) of a same automaton a. We call \({{a}_{i}}\!\leadsto \!{{a}_{j}}\) an objective and define \(\mathrm{local}\text{}\mathrm{paths}({{a}_{i}}\!\leadsto \!{{a}_{j}})\) the set of the acyclic paths of local transitions between \(a_i\) and \(a_j\). Definition 4 gives the formalization of \(\mathrm{local}\text{}\mathrm{paths}\) where we use the following notations. Given a local transition \(t= {a}_{i} \xrightarrow {\ell } {a}_{j}\in T\), \(\mathrm{orig}(t)\mathop {=}\limits ^{\Delta }a_i\), \(\mathrm{dest}(t)\mathop {=}\limits ^{\Delta }a_j\), \(\mathrm{enab}(t)\mathop {=}\limits ^{\Delta }\ell\). Given \(z\in \mathbb N\), \({\tau }=({\tau }^n)_{n=1,\dots ,z}\) is a sequence of local transitions indexed by \(n\in \{1,\dots ,z\}\); \({\tau }=z\) is the length of the sequence \({\tau }\); and \(\varepsilon\) denotes the empty sequence (\(\varepsilon =0\)).
Definition 4

If \(i=j\), \(\mathrm{local}\text{}\mathrm{paths}({{a}_{i}}\!\leadsto \!{{a}_{i}})\mathop {=}\limits ^{\Delta }\{\varepsilon \}\);

If \(i\ne j\), a sequence \({\tau }\) of transitions in T(a) is in \(\mathrm{local}\text{}\mathrm{paths}({{a}_{i}}\!\leadsto \!{{a}_{j}})\) if and only if it satisfies the following properties:

\(\mathrm{orig}({\tau }^1) = a_i\), \(\mathrm{dest}({\tau }^{{\tau }})=a_j\),

\(\forall n, 1\le n< {\tau }\), \(\mathrm{dest}({\tau }^n) = \mathrm{orig}({\tau }^{n+1})\),

\(\forall n,m, {\tau }\ge n > m\ge 1, \mathrm{dest}({\tau }^n)\ne \mathrm{orig}({\tau }^m)\).

In the AN of Fig. 1, \(\mathrm{local}\text{}\mathrm{paths}({{a}_{0}}\!\leadsto \!{{a}_{2}}) = \{ ( {a}_{0} \xrightarrow {b_0,c_0} {a}_{2})\}\); \(\mathrm{local}\text{}\mathrm{paths}({{c}_{0}}\!\leadsto \!{{c}_{2}})=\{ ( {c}_{0} \xrightarrow {a_1} {c}_{1}, {c}_{1} \xrightarrow {b_0} {c}_{2}) \}\); \(\mathrm{local}\text{}\mathrm{paths}({{c}_{2}}\!\leadsto \!{{c}_{1}})=\emptyset\).
Focusing on the reachability of a single local state \(g_1\) from a state s where \(s({g})=g_0\), the analyses essentially start with the local paths in \(\mathrm{local}\text{}\mathrm{paths}({{g}_{0}}\!\leadsto \!{{g}_{1}})\): if \(g_1\) is reachable, then at least one of the local paths \({\tau }\) has to be realizable, meaning that all the local states of its conditions (\({\mathrm{enab}({\tau })}\)) should be reachable. This leads to a recursive reasoning by repeating the procedure with the objectives from s to the local states in \({\mathrm{enab}({\tau })}\).
The dependence relationships between the local paths of the different automata can be represented as a graph, where the nodes are all the local states, all the possible objectives, and all their local paths. Such a graph is called a Local Causality Graph (LCG), and abstracts all the executions of the AN.
Definition 5
From a complexity point of view, local paths are computed for each pair of local states within every automata. Since the length of a local path is at most the number of local states within the automaton, the number of local paths is at most polynomial in the number of local transitions and exponential in the size of the single automaton. In practice, the automata are small, typically between 2 and 4 states for biological models. Therefore, LCGs turn out to be very small compared to the reachable state space of biological networks. They have been successfully applied for analysing dynamics of ANs with hundreds or thousands of automata, which were intractable with standard model checking approaches [13, 17].
The overapproximation and underapproximation reduce to finding subgraphs of LCGs that satisfy some particular structural properties, which have been proven to be necessary or sufficient for the reachability property, respectively. The overapproximation reduces here to finding an acyclic subgraph that contains the main objective \({{g}_{0}}\!\leadsto \!{{g}_{1}}\) where leaves are empty local paths, and initial states match with the given initial state. This condition can be verified in a time linear with the LCG size [13]. The underapproximation we consider in the paper requires to find an acyclic subgraph where all leaves are empty local states, where conditions of local paths (\({\mathrm{enab}({\tau })}\)) are independent, and which contain all possible objectives that can be involved for the goal reachability [14]. This requires enumerating over many possible subLCGs, but checking if a subLCG satisfies the sufficient condition is linear in its size, leading to an NP formulation.
Theorem 1
(Reachability overapproximation [13]) Given a state \(s\in S\), \(g_1\in L\) is reachable from s, i.e., there exists \(s'\in S\) such that \(s\rightarrow ^{*}s'\), only if \({s({g})}\!\leadsto \!{g_1}\in \Omega\), where \(\Omega \subseteq \mathcal O\) is the least fixpoint of the monotonic function \(\mathrm{F}:2^{\mathcal O}\rightarrow 2^{\mathcal O}\) with \(\mathrm{F}(\Omega ) \mathop {=}\limits ^{\Delta }\{ {{a}_{i}}\!\leadsto \!{{a}_{j}}\in \mathcal O\mid \exists {\tau }\in \mathrm{local}\text{}\mathrm{paths}({{a}_{i}}\!\leadsto \!{{a}_{j}}): \forall b_k\in {\mathrm{enab}({\tau })}, {s({b})}\!\leadsto \!{b_k}\in \Omega \}.\)
Theorem 2

\(g_1\in L'\);

\(\forall a_j\in L'\), \((a_j,{s({a})}\!\leadsto \!{a_j})\in E'\) and \(\forall a_i\in L',a_i\ne a_j\), \((a_j,{{a}_{i}}\!\leadsto \!{{a}_{j}})\in E'\);

\(\forall {{a}_{i}}\!\leadsto \!{{a}_{j}}\in \mathcal O'\), \(\exists {\tau }\in \mathrm{local}\text{}\mathrm{paths}({{a}_{i}}\!\leadsto \!{{a}_{j}}): ({{a}_{i}}\!\leadsto \!{{a}_{j}},{\tau })\in E'\),

\(\forall {\tau }\in P', \{ ({\tau },b_k) \in E\}\subseteq E'\);

\((L',\mathcal O',P',E')\) is acyclic

\(\forall {\tau }\in P'\), \(\forall n\in \{1,\dots ,{\tau }\}\), there exists at most one \(a_i\in \mathrm{enab}({\tau }^n)\) such that \(\forall b_j\in \mathrm{enab}({\tau }^n),b_j\ne a_i\), \(S(a) \cap {\text {conn}}_{E'}(b_j) \nsubseteq \{a_i\}\).
Figure 2 gives examples of subLCGs which approximate the reachability of \(a_2\) in the AN of Fig. 1. The left LCG does not satisfy the necessary condition (no local paths from \(c_2\) to \(c_0\)), hence \(a_2\) is not reachable from the given initial state \(\langle a_1,b_0,c_2 \rangle\). The middle LCG does satisfy the necessary condition. Finally, the right LCG is a valid subLCG for the sufficient condition for \(a_2\) reachability. Whereas these examples show only acyclic LCGs, in general, cycles can exist in the causality analysis, revealing cyclic (nonsolvable) dependencies between transitions.
ASP syntax and semantics
AnswerSet Programming allows for automatic logical deductions thanks to an ASP model which declares variables, domains, and constraints, and to a solver which computes the solutions, possibly accounting for optimisation criteria. It is close to SAT (propositional satisfiability) [22] and known to be efficient for enumerating solutions of NP problems while providing a convenient language for specifying the model.
We give a very brief overview of ASP syntax and semantics that we use in the next section. Please refer to [10, 23, 24] for an indepth introduction to ASP.
An ASP program is a Logic Program (LP) formed by a set of logical rules, composed of first order logic predicates, of the form:
Results
Bifurcations
Given an initial state \(s_0\) and a goal local state, a bifurcation transition is a transition from a state where the goal is reachable to a state where the goal is not reachable, i.e., there exists no sequence of transitions that leads to a state containing the goal local state. This implies that there exists at least one reachable attractor which does not contain a goal state.
Let us consider the AN of Fig. 1, with \(s_0=\langle a_0,b_0,c_0 \rangle\) and the goal \(a_2\). Figure 3 shows all the possible transitions from \(s_0\).
The states with a grey background are connected to a state containing \(a_2\) (in thickblue). The transitions in thickred are bifurcations: once in a white state, there exist no sequence of transitions leading to \(a_2\). The white states constitute an attractor of the state graph from which it is not possible to reach a state containing \(a_2\). In other words, bifurcations are the transitions from a grey state to a white state. Note that each transition between two global states is generated by one (and only one) local transition in the AN. In this example, \(t_8\) is the (unique) local transition responsible for bifurcations from \(s_0\) to \(a_2\).
Given an AN \((\Sigma ,S,T)\), we search to identify the local transitions \(t_b\in T\) which trigger a bifurcation from a state reached from \(s_0\in S\) for a given goal, which describes a set of states \(S_g\subseteq S\). We call \(s_b\) a global state where a bifurcation occurs, and \(s_u\) the global state after the bifurcation: \(s_u = s_b\cdot t_b\). The goal is reachable from \(s_b\) but not from \(s_u\). This is illustrated by Fig. 4. Note that, as illustrated, \(s_b\) is not inevitably reached: we allow the existence of alternative paths of transitions to the goal.
Definition 6 formalizes the notion of bifurcation, where the goal is specified by a local state \(g_1\) (hence \(S_g=\{ s\in S\mid s({g})=g_1\}\)). Note that this goal specification does not loose generality, as one can build an automaton g with local states \(g_0\) and \(g_1\), and with a local transitions from \(g_0\) to \(g_1\) conditioned by each desired goal state.
Definition 6
(Bifurcation transition) Given an AN \((\Sigma ,S,T)\), a global state \(s_0\in S\) and a goal local state \(g_1\) with \(g\in \Sigma\) and \(g_1\in S(g)\), a bifurcation transition is a transition \(s_b\xrightarrow {t_b} s_u\) of the AN with \(s_b,s_u\in S\) and \(t_b\in T\), such that (1) \(s_0\rightarrow ^{*}s_b\); (2) \(\exists s\in S\) where \(s({g})=g_1\) with \(s_b\rightarrow ^{*}s\); and (3) \(\forall s'\in S\) where \(s_u\rightarrow ^{*}s'\), \(s'({g}) \ne g_1\).
Alongside the enumeration of candidate \(s_b\) and \(t_b\), reachability checking is at the core of the bifurcation identification.
As explained in the introduction, verifying such a CTL property is a PSPACEcomplete problem. In the rest of this paper, we introduce NP approximations of the bifurcation property that can be verified by a SAT/ASP solver.
Identification of bifurcations using ASP
Among the states reachable from \(s_0\), we want to find a state \(s_b\) from which (1) the goal is reachable and (2) there exists a transition to a state from which the goal is not reachable. Putting aside the complexity of reachabilities checking, the enumeration of candidate states \(s_b\) is a clear bottleneck for the identification of bifurcations in an AN.
Our approach combines the formal approximations and (optionally) unfoldings introduced in the previous section with a constraint programming approach to efficiently identify bifurcations. As discussed in the previous section, checking the over/underapproximations from candidate states and subLCGs is easy. For the case of unfolding, checking if a state s belongs to the state space represented by a complete finite prefix is NPcomplete [26]. Therefore, a declarative approach such as ASP [10] is very well suited for specifying admissible \(s_b\) and \(t_b\), and obtaining efficient enumerations of solutions by a solver.
We first present the general scheme of our method, and then given details on its implementation with ASP.
General scheme
Relying on the unfolding from \(s_b\) (\(\mathrm{unf}\text{}\mathrm{prefix}(s_b)\)) is not considered here, as it would require to compute a prefix from each \(s_b\) candidate, whereas \(\mathrm{unf}\text{}\mathrm{prefix}(s_0)\) is computed only once before the bifurcation identification.
Complexity
The decision of \((\mathrm{I1}^\#)\), \((\mathrm{I2}^\#)\), and \((\mathrm{I3}^\#)\) can be formulated as NP problems in the size of the LCG. Recall that the size of the LCG is polynomial with the number of local states and local transitions in the AN, and exponential with the number of local states within a single automaton.
The decision of \((\mathrm{I3})\) is NPcomplete with respect to the size of the prefix of the unfolding, which computation is PSPACE [12]. Nevertheless, checking if \((\mathrm{I1}^\#)\), \((\mathrm{I2}^\#)\), and \((\mathrm{I3})\) are satisfied can remain more tractable than checking the exact CTL property: \((\mathrm{I3})\) uses the (complete) set of reachable states, but does not require the transitions.
ASP implementation
We present here the main rules for implementing the identification of bifurcation transitions with ASP. A significant part of ASP declarations used by \((\mathrm{I1}^\#)\), \((\mathrm{I2}^\#)\), \((\mathrm{I3})\), and \((\mathrm{I3}^\#)\) are generated from the prior computation of \(\mathrm{local}\text{}\mathrm{paths}\) and, in the case of \((\mathrm{I3})\), of the prefix of the unfolding. Applied on Fig. 1, our implementation correctly uncovers \(t_8\) as a bifurcation for \(a_2\).
Problem instance: local states, transitions, and states Every local state \(a_i\in S(a)\) of each automaton \(a\in \Sigma\) is declared with the predicate 1s(a, i). We declare the local transitions of the AN and their associated conditions by the predicates tr(id, a, i, j) and trcond(id, b, k), which correspond to the local transition \({a}_{i} \xrightarrow {\{b_k\}\cup \ell } {a}_{j}\in T\). States are declared with the predicate s(ID, A, I) where ID is the state identifier, and A, I, the automaton and local state present in that state. Finally, the goal \(g_1\) is declared with goal(g, 1).
Experiments
We evaluated our method in three real biological networks case studies that show differentiation capabilities. We selected networks that show at least two attractors reachable from the same initial state. For each network, we supplied a goal state representing one attractor. Thus, the goal state is a state reachable from the selected initial state. Because at least one attractor is reachable from the same selected initial state, transitions that lead to the other attractors are by definition bifurcation transitions. We aimed at identifying transitions that cause a bifurcation for the reachability of the goal state. The three case studies used are briefly described in the following paragraphs.
Models, initial states, and goals
Immunity control in bacteriophage lambda (Lambda phage)
In temperate bacteriophages the choice of entering lysis and lysogenization cycles is controlled by bacterial and viral genes. In the lambda case, at least five viral genes (refered to as cI, cro, cII, N and cIII) and several bacterial genes were identified. We applied our method on an AN equivalent to the model introduced in [28]. Based on this study we selected one initial state and two different goals, corresponding to lysis or lysogenization phases both reachable from the initial state. The lysis phase is characterized by the attractor \(\{\langle CI_0, Cro_2, CII_0, N_0 \rangle , \langle CI_0, Cro_3, CII_0, N_0 \rangle \}\), while the lysogenization phase, by \(\{\langle CI_2, Cro_0, CII_0, N_0 \rangle , \langle CI_2, Cro_0, CII_1, N_0 \rangle \}\). The initial state was \(\langle CI_0, Cro_0, CII_0, N_0 \rangle\). The selected goals where \(CI_2\) (lysogenization attractor) and \(Cro_2\) (lysis attractor). One can not access the lysogenization goal from the lysis attractor and vice versa.
Epidermal growth factor and tumor necrosis factor\(_{\alpha }\)
EGF/TNF is a model that combines two important mammalian signaling pathways induced by the epidermal growth factor (EGF) and tumor necrosis factor alpha (TNF\(_{\alpha }\)) [29, 30]. EGF and TNF\(_{\alpha }\) ligands stimulate ERK, JNK and p38 MAPK cascades, the PI3K/AKT pathways, and the NFkB cascade. This network of 28 components encompasses crosstalks between these pathways as well as two negative feedback loops. We applied our method from the initial state corresponding to the signal TNF\(_\alpha\) active and EGF inactive; the two goals refer to downstream proteins, namely the inactivation of NBkB and the activation of its inhibitor, IKB.
Thelper cell plasticity
Thelper cell has been studied in [8] in order to investigate switches between attractors subsequent to changes of input conditions. It is a cellular network regulating the differentiation of Thelper (Th) cells, which orchestrate many physiological and pathological immune responses. Thelper (CD4+) lymphocytes play a key role in the regulation of the immune response. By APC activation, native CD4 T cells differentiate into specific Th subtypes producing different cytokines which influence the activity of immune effector cell types. Differentiation in one subtype rather than another depends on the presence of specific polarizing cytokine combinations. These different lineages are characterized by a set of cytokines they express under the control of a master regulator transcriptional factor. Each master regulator is critically involved in the driving of the differentiation of the Th lineage they specify. The network is composed of 101 nodes and 221 interactions; the corresponding AN has in total 381 local transitions. Note that due to the very high number of reachable states from some particular initial states of the network, the authors in [8] had to analyse a reduced version of this network, which does not preserve all the reachability properties. In this work, we analyse the full model. We selected initial states and goals for this model according to the attractors identified in [8].
We applied our method for three different initial states, namely th1, th2, and pluri. The two formers are arbitrary initial states from which particular subtypes (Th1 and Th2, respectively) are reachable. The “pluri” initial state corresponds to a potential cell environment which can trigger a differentiation among different cell subtypes (the differentiation is nondeterministic in the Boolean model): the initial states specify that APC, IL1B\(_e\), IL25\(_e\), IL27\(_e\), IL29\(_e\), IL2\(_e\), IL33\(_e\), IL36\(_e\), IL4\(_e\), and TGFB\(_e\) (\(_e\) stands for environment) are active, and only them.
In all cases, the goals correspond to the activation of master regulators and cytokines which are specific markers for differentiated Th subtypes.
Methods
 1.
Exact model checking using NuSMV [20]: for each local transition in the AN specification, we verify if it is a bifurcation transition according to the CTL formula given in Eq. 1. This identification is exact and complete, but has a high theoretical complexity.
 2.
ASP solving of \((\mathrm{I1}^\#)\), \((\mathrm{I2}^\#)\), and \((\mathrm{I3})\) (computation of the reachable states set from \(s_0\)). We use clingo 4.5.3 [31] as ASP solver, and Mole [32] for the computation of the complete finite prefix for \((\mathrm{I3})\). This identification is exact but incomplete: some bifurcation transitions can be missed.
 3.
ASP solving of \((\mathrm{I1}^\#)\), \((\mathrm{I2}^\#)\), and \((\mathrm{I3}^\#)\) (reachability underapproximation). We use clingo 4.5.3 [31] as ASP solver. This identification is exact but incomplete: some bifurcation transitions can be missed. Due to the additional approximations brought by \((\mathrm{I3}^\#)\) compared to \((\mathrm{I3})\), it is expected that less bifurcation transitions can be identified with this latter approach, but with a higher scalability.
Results
Table 1 summarizes the results of the identification of bifurcation transition for the models, initial states and goals described above. In the remainder of this section, we discuss two aspects of these results: the scalability of our approach and the biological interpretation of the identified bifurcations.
Scalability
For the analysed models, exact model checking and approximation using \((\mathrm{I3})\) give comparable execution times, with nevertheless an advantage for \((\mathrm{I3})\) in most cases. Because the model checking approach is exact, the identified bifurcation transitions is complete, whereas, due to \((\mathrm{I1}^\#)\) and \((\mathrm{I2}^\#)\) approximations, the second approach generally identifies less bifurcation transitions. As supported by the experiments on Th_th2, the computation of \((\mathrm{I3})\) should be, in practice, more tractable than the verification of the CTL expression of Eq. 1. Indeed, \((\mathrm{I3})\) requires only to compute the set of reachable states, where CTL verification requires, in addition, to store the transitions between these states.
Importantly, both methods fail on the Th_pluri model (no result after 2 h). This can be explained by the very large reachable dynamics. In the case of model checking, we emphasize that NuSMV fails due to the size of the model, and it has been able to verify none of the supplied CTL properties. In the case of \((\mathrm{I3})\), the failure is due to the complete finite prefix computation which does not terminate in due time; this suggests that the reduction relying on concurrent transitions is not sufficient for this particular model to achieve a tractable representation of the reachable state space. Future work may consider other symbolic representations of the reachable state space, notably using BDDs and variants [19].
The third approach, using the additional approximation \((\mathrm{I3}^\#)\) is tractable on the large model, supporting a higher scalability of this latter approach. Indeed, the computation of the finite complete prefix for \((\mathrm{I3})\) is PSPACEcomplete, solving \((\mathrm{I3}^\#)\) is NP (with LCG size). Whereas, the difference between PSPACE and NP complexity classes is not known, it is a common observation in practice that NP solving (notably using SAT) is more tractable than PSPACE solving. As expected, in the smaller models, less bifurcation transitions than the former approaches are returned. Concerning the ASP grounding and solving computation times (data not shown) the grounding time depends on the model size and is independent of the choice of the initial state and goal; whereas in the case of the solving time, the choice of the initial state may have an important impact. This effect appears much more visible in the larger Thelper model. Grounding time has very small and similar values (\(\approx\)0.05s) for the small and middle size models (4–22 automata and 11–55 transitions). However in the larger model (six times more transitions) the grounding time raises to 2 orders of magnitude. Solving time behaves differently, while it remains small and similar for small and middle size models. It raises to 4 orders of magnitude in the case of the larger model. Across all studied models the proportion of grounding and solving time against total computation time varies from 14–61% for grounding and 19–71% for solving. We observe that in the small and middle size models the grounding and solving proportion remains quite similar, while the grounding time proportion is much smaller than the solving one in the largescale model.
Biological interpretation

STAT6 0 \(\rightarrow\) 1 when IL4R=1

RORGT 0 \(\rightarrow\) 1 when BCL6=0 and FOXP3=0 and STAT3=1 and TGFBR=1

STAT1 0 \(\rightarrow\) 1 when IL27R=1

STAT1 0 \(\rightarrow\) 1 when IFNGR=1
The fact that these transitions are bifurcation transitions for FOXP3 means the following: starting from the specified initial state, there exists future states where the occurence of one of these transitions puts the system in a state where FOXP3 is no longer activable, and in particular, all future attractors have FOXP3 inactive. In that precise case, the active form of FOXP3 is a marker for the “Treg” Th subtype: hence, these 4 bifurcation transitions can prevent the differentiation of the cell in this type.
Conclusions
This paper presents an original combination of computational techniques to identify transitions of a dynamical system that can remove its capability to reach a (set of) states of interest. Our methodology combines static analysis of ANs dynamics, partial order representations of the state space, and constraint programming to efficiently enumerate those bifurcations. To our knowledge, this is the first integrated approach for deriving bifurcation transitions from concurrent models, and ANs in particular.
Bifurcations are key features of biological networks, as they model decisive transitions which control the differentiation of the cell: the bifurcations decide the portions of the state space (no longer) reachable in the longrun dynamics. Providing automatic methods for capturing those differentiations steps is of great interest for biological challenges such as cell reprogramming [8, 33], as they suggest targets for modulating undergoing cellular processes. Our approach is focused on nondeterministic discrete dynamics, in opposition to deterministic systems, such as piecewiseaffine systems on which differentiation is determined by the initial state in a continuous space [34].
Bifurcation transitions can be modelled as CTL properties and verified by exploring the reachable state and transition space. Our method aims at circumventing the state space explosion problem for large networks thanks to the formal approximations of reachability properties.
Given an initial state of the AN and a goal state, our method first computes static abstractions of the AN dynamics and (optionally) a symbolic representation of the reachable state space with socalled unfoldings. From those prior computations, a set of constraints is issued to identify bifurcation transitions. We used ASP to declare the admissible solutions and the solver clingo to obtain their efficient enumerations. For large models, the unfolding may be intractable: in such a case, the methods relies only on reachability over and underapproximations. By relying on those relaxations which can be efficiently encoded in ASP, our approach avoids costly exact checking, and is tractable on large models, as supported by the experiments.
For applications when the initial state is not fully determined, or equivalently, a set of initial states has to be considered, our approach, including CTL and approximations, can be easily extended for the identification of universal bifurcation transitions: such transitions are bifurcation transitions for every candidate initial state. Indeed, the verification of CTL properties is universal, as well as the implemented underapproximation of reachability \((\mathrm{I3}^\#)\). The unfolding prefix \((\mathrm{I3})\) can also be extended to multiple initial states [11]. The identification of existential bifurcation transitions, i.e., such that there exists at least one candidate initial state for which the transition is a bifurcation transition, could also be implemented for the approximation \((\mathrm{I3}^\#)\) using ASP, but with a potential lower scalability.
Further work will consider the complete identification of bifurcation transitions, by allowing false positives (but no false negatives). In combination with the underapproximation of the bifurcations presented in this paper, it will provide an efficient way to delineate all the transitions that control the reachability of the goal attractor. Moreover, we will investigate the implementation of refined over and underapproximations of reachability described in [13] for better capturing transition ordering constraints. Future work will also focus on exploiting the identified bifurcations for driving estimations of the probability of reaching the goal at steady state, in the scope of hybrid models of biological networks [35, 36].
Declarations
Authors' contributions
Research: LFF, OR, CG, LP. Tool implementation: LFF, CG, LP. Experiments: LFF, LP. Manuscript writing: LFF, OR, CG, LP. All authors read and approved the manuscript.
Acknowledgements
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Availability of data and materials
Models and scripts are provided in Additional file 2.
Funding
CG was supported by the CNRS chair of excellence funding. LP has been partially funded by ANRFNR project “AlgoReCell” (ANR16CE120034).
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Authors’ Affiliations
References
 Thomas R, d’Ari R. Biological feedback. Boca Raton: CRC Press; 1990.Google Scholar
 Plathe E, Mestl T, Omholt SW. Feedback loops, stability and multistationarity in dynamical systems. J Biol Syst. 1995;3:569–77.View ArticleGoogle Scholar
 Richard A. Negative circuits and sustained oscillations in asynchronous automata networks. Adv Appl Math. 2010;44(4):378–92.View ArticleGoogle Scholar
 Paulevé L, Richard A. Static analysis of Boolean networks based on interaction graphs: a survey. Electron Notes Theor Comput Sci. 2011;284:93–104. [Proceedings of the 2nd International Workshop on Static Analysis and Systems Biology (SASB 2011)] View ArticleGoogle Scholar
 Thomas R. Boolean formalization of genetic control circuits. J Theor Biol. 1973;42(3):563–85.View ArticlePubMedGoogle Scholar
 Bernot G, Cassez F, Comet JP, Delaplace F, Müller C, Roux O. Semantics of biological regulatory networks. Electron Notes Theor Comput Sci. 2007;180(3):3–14.View ArticleGoogle Scholar
 Sahin O, Frohlich H, Lobke C, Korf U, Burmester S, Majety M, Mattern J, Schupp I, Chaouiya C, Thieffry D, Poustka A, Wiemann S, Beissbarth T, Arlt D. Modeling ERBB receptorregulated G1/S transition to find novel targets for de novo trastuzumab resistance. BMC Syst Biol. 2009;3(1):1.View ArticlePubMedPubMed CentralGoogle Scholar
 AbouJaoudé W, Monteiro PT, Naldi A, Grandclaudon M, Soumelis V, Chaouiya C, Thieffry D. Model checking to assess thelper cell plasticity. Front Bioeng Biotechnol. 2015;2:86.PubMedPubMed CentralGoogle Scholar
 Calzone L, Barillot E, Zinovyev A. Predicting genetic interactions from boolean models of biological networks. Integr Biol. 2015;7(8):921–9.View ArticleGoogle Scholar
 Baral C. Knowledge representation, reasoning and declarative problem solving. New York: Cambridge University Press; 2003.View ArticleGoogle Scholar
 Chatain T, Haar S, Jezequel L, Paulevé L, Schwoon S. Characterization of reachable attractors using Petri net unfoldings. In: Mendes P, Dada J, Smallbone K, editors. Computational methods in systems biology, vol. 8859. Lecture notes in computer science. Cham: Springer; 2014. p. 129–42.Google Scholar
 Cheng A, Esparza J, Palsberg J. Complexity results for 1safe nets. Theor Comput Sci. 1995;147(1&2):117–36.View ArticleGoogle Scholar
 Paulevé L, Magnin M, Roux O. Static analysis of biological regulatory networks dynamics using abstract interpretation. Math Struct Comput Sci. 2012;22(04):651–85.View ArticleGoogle Scholar
 Folschette M, Paulevé L, Magnin M, Roux O. Sufficient conditions for reachability in automata networks with priorities. Theor Comput Sci. 2015;608(Part 1):66–83 (From Computer Science to Biology and Back).View ArticleGoogle Scholar
 Klamt S, Gilles ED. Minimal cut sets in biochemical reaction networks. Bioinformatics. 2004;20(2):226–34.View ArticlePubMedGoogle Scholar
 Samaga R, Kamp AV, Klamt S. Computing combinatorial intervention strategies and failure modes in signaling networks. J Comput Biol. 2010;17(1):39–53.View ArticlePubMedGoogle Scholar
 Paulevé L, Andrieux G, Koeppl H. Underapproximating cut sets for reachability in large scale automata networks. In: Sharygina N, Veith H, editors. Computer aided verification, vol. 8044. Lecture notes in computer science. Berlin: Springer; 2013. p. 69–84.View ArticleGoogle Scholar
 Acuna V, Chierichetti F, Lacroix V, MarchettiSpaccamela A, Sagot MF, Stougie L. Modes and cuts in metabolic networks: complexity and algorithms. Biosystems. 2009;95(1):51–60.View ArticlePubMedGoogle Scholar
 Hamez A, ThierryMieg Y, Kordon F. Building efficient model checkers using hierarchical set decision diagrams and automatic saturation. Fundam Inform. 2009;94(3–4):413–37.Google Scholar
 Cimatti A, Clarke E, Giunchiglia E, Giunchiglia F, Pistore M, Roveri M, Sebastiani R, Tacchella A. NuSMV 2: an opensource tool for symbolic model checking. Computer aided verification, vol. 2404. Lecture notes in computer science. Berlin: Springer; 2002. p. 241–68.Google Scholar
 Esparza J, Heljanko K. Unfoldings—a partialorder approach to model checking. Berlin: Springer; 2008.Google Scholar
 Lin F, Zhao Y. Assat: computing answer sets of a logic program by sat solvers. Artif Intell. 2004;157(1):115–37.View ArticleGoogle Scholar
 Gelfond M, Lifschitz V. The stable model semantics for logic programming. ICLP/SLP. 1988;88:1070–80.Google Scholar
 Gebser M, Kaminski R, Kaufmann B, Schaub T. Answer set solving in practice. Synthesis lectures on artificial intelligence and machine learning. San Rafael: Morgan and Claypool Publishers; 2012.Google Scholar
 Clarke EM, Emerson EA, Sistla AP. Automatic verification of finitestate concurrent systems using temporal logic specifications. ACM Trans Program Lang Syst. 1986;8(2):244–63.View ArticleGoogle Scholar
 Esparza J, Schröter C. Unfolding based algorithms for the reachability problem. Fund Inform. 2001;47(3–4):231–45.Google Scholar
 Heljanko K. Using logic programs with stable model semantics to solve deadlock and reachability problems for 1safe petri nets. Fund Inform. 1999;37(3):247–68.Google Scholar
 Thieffry D, Thomas R. Dynamical behaviour of biological regulatory networksII. Immunity control in bacteriophage lambda. Bull Math Biol. 1995;57:277–97.PubMedGoogle Scholar
 MacNamara A, Terfve C, Henriques D, Bernabé BP, SaezRodriguez J. Statetime spectrum of signal transduction logic models. Phys Biol. 2012;9(4):045003.View ArticlePubMedGoogle Scholar
 Chaouiya C, Bérenguier D, Keating SM, Naldi A, van Iersel MP, Rodriguez N, Dräger A, Büchel F, Cokelaer T, Kowal B, Wicks B, Gonçalves E, Dorier J, Page M, Monteiro PT, von Kamp A, Xenarios I, de Jong H, Hucka M, Klamt S, Thieffry D, Le Novère N, SaezRodriguez J, Helikar T. SBML qualitative models: a model representation format and infrastructure to foster interactions between qualitative modelling formalisms and tools. BMC Syst Biol. 2013;7(1):1–15.View ArticleGoogle Scholar
 Gebser M, Kaminski R, Kaufmann B, Schaub T. Clingo = ASP + control: preliminary report. In: Leuschel M, Schrijvers T, editors. Technical communications of the thirtieth international conference on logic programming (ICLP’14), arXiv:1405.3694v1. Theory and practice of logic programming, online supplement. 2014.
 Schwoon S. Mole. http://www.lsv.enscachan.fr/~schwoon/tools/mole/. Accessed 1 July 2017.
 Crespo I, del Sol A. A general strategy for cellular reprogramming: the importance of transcription factor crossrepression. Stem Cells. 2013;31(10):2127–35.View ArticlePubMedGoogle Scholar
 Yordanov B, Batt G, Belta C. Model checking discretetime piecewise affine systems: application to gene networks. In: Control Conference (ECC), 2007 European. New York: IEEE; 2007. p. 2619–26.Google Scholar
 Shmulevich I, Dougherty ER, Kim S, Zhang W. Probabilistic boolean networks: a rulebased uncertainty model for gene regulatory networks. Bioinformatics. 2002;18(2):261–74.View ArticlePubMedGoogle Scholar
 Stoll G, Viara E, Barillot E, Calzone L. Continuous time boolean modeling for biological signaling: application of Gillespie algorithm. BMC Syst Biol. 2012;6(1):116.View ArticlePubMedPubMed CentralGoogle Scholar